A statistical model $\mathcal{M}$ is a family of probability distributions, characterised by a
set of continuous parameters known as the parameter space. This possesses natural
geometrical properties induced by the embedding of the family of probability
distributions into the space of all square-integrable functions. More precisely, by
consideration of the square-root density function we can regard $\mathcal{M}$ as a submanifold
of the unit sphere $\mathcal{S}$ in a real Hilbert space $\mathcal{H}$. Therefore, $\mathcal{H}$ embodies the
‘state space’ of the probability distributions, and the geometry of the given statistical
model can be described in terms of the embedding of $\mathcal{M}$ in $\mathcal{S}$. The geometry
in question is characterised by a natural Riemannian metric (the Fisher-Rao metric),
thus allowing us to formulate the principles of classical statistical inference
in a natural geometric setting. In particular, we focus attention on the variance
lower bounds for statistical estimation, and establish generalisations of the
classical Cramer-Rao and Bhattacharyya inequalities, described in terms of the
geometry of the underlying real Hilbert space. As a comprehensive illustration of
the utility of the geometric framework, the statistical model $\mathcal{M}$ is then specialised
to the case of a submanifold of the state space of a quantum mechanical system.
This is pursued by introducing a compatible complex structure on the underlying
real Hilbert space, which allows the operations of ordinary quantum mechanics to
be reinterpreted in the language of real Hilbert space geometry. The application
of generalised variance bounds in the case of quantum statistical estimation leads
to a set of higher order corrections to the Heisenberg uncertainty relations for
canonically conjugate observables.


1. Introduction 

The purpose of this paper is twofold: first, to develop a concise geometric
formulation of statistical estimation theory; and second, to apply this
formalism to quantum statistical inference. Our intention is to establish the basic
concepts of statistical estimation within the framework of Hilbert space geometry. 
This line of enquiry, although suggested by Bhattacharyya (1942), Rao (1945), 
and Dawid (1975,1977), has not hitherto been pursued in the spirit of the fully 
geometric program that we undertake here. In 1945 Rao introduced a Riemannian 
metric, in local coordinates given by the components of the Fisher information 
matrix, on the parameter space of a family of probability distributions. He also 
introduced the corresponding Levi-Civita connection associated with the Fisher
information matrix, and proposed the geodesic distance induced by the metric as
a measure of dissimilarity between probability distributions. Thirty years after 
Rao’s initial work, Efron (1975) carried the argument a step forward when he 
introduced, in effect, a new affine connection on the parameter space manifold, 
and thus shed light on the role of the embedding curvature of the statistical 
model in the relevant space of probability distributions. The work of Efron has 
been followed up and extended by a number of authors (see, e.g., Amari 1982, 
1985, Barndorff-Nielsen, Cox, and Reid 1986, and Kass 1989), particularly in 
the direction of asymptotic inference. However, the applicability of modern differential
geometric methods to statistics remains in many respects a surprising
development, about which there is still much to be learned. 

In a remark on Efron’s construction, Dawid (1975) asked whether there might 
be a fundamental role played by the Levi-Civita connection in statistical analysis. 
The aim of this paper in part is to answer this important question, by studying 
statistical inference from a Hilbert space perspective. In particular, we shall study
the geometric properties of a statistical model $\mathcal{M}$ induced when we embed it, via
the square-root map, in the unit sphere $\mathcal{S}$ in a real Hilbert space $\mathcal{H}$. This leads in
a natural way to the Levi-Civita connection on $\mathcal{M}$.

It was also pointed out by Dawid (1977), in the case of an embedding given by 
the square-root of the likelihood function, that the Hilbert space norm induces a 
spherical geometry (see also Burbea 1986). If the density function is parameterised
by a set of parameters $\theta$, then for each value of $\theta$ we have a corresponding point
on the unit sphere $\mathcal{S}$ in the Hilbert space $\mathcal{H}$. By choosing a basis in $\mathcal{H}$, we can
associate a unit vector $\xi^a(\theta)$ with this point, and work with the abstract vector
$\xi^a(\theta)$ instead of $\sqrt{p_\theta(x)}$. The index ‘a’ is abstract in the sense that we do not
necessarily regard it as ‘taking values’; instead, it serves as a kind of ‘place-keeper’ 
for various tensorial operations. We show how the abstract index approach can 
be used as a powerful tool in statistical investigations. 

Our program includes the exploitation of this methodology to study geometrical 
and statistical aspects of quantum mechanics. The specialisation to quantum 
theory requires an extra ingredient, namely, a complex structure. Thus, if we take 
our real Hilbert space and impose on it a complex structure, compatible with the 
real Hilbert space metric, the resulting geometry is sufficiently rich to allow us 
to introduce all of the standard operations of quantum theory. 

While the conventional approach to quantum statistical estimation has essentially
been merely ‘by analogy’ with classical estimation, our approach differs
in the sense that we view quantum estimation theory as arising in essence as a 
natural extension of the classical theory, when the theory is ‘enriched’ with the 
addition of a complex structure, and the system of random variables is expanded 
to include incompatible observables. 

By way of contrast we note that most of the current literature of quantum 
statistical estimation (see, e.g., Accardi and Watson 1994, Braunstein and Caves 
1994, Brody and Meister 1996a,b, Helstrom 1976, Holevo 1982, Ingarden 1981, 
Jones 1994, Malley and Hornstein 1993, Nagaoka 1994, and references cited 
therein) takes the space of density matrices as the relevant state space in terms of 
which estimation problems are formulated, the view there being that the ‘space 
of density matrices’ is the quantum mechanical analogue of the ‘space of density 
functions’ when we consider the quantum estimation problem.

In our approach, however, we find it useful to emphasise the role of the space
of pure quantum states. In fact, the space of density matrices has a very complicated
geometric structure, owing essentially to the various levels of ‘degeneracy’
a density matrix can possess, and the relation of these levels to one another. It
can be argued that to tackle the quantum estimation problem head-on from a
density matrix approach is not necessarily advantageous. In any case, the consideration
of pure states allows us to single out most sharply the relation between
classical statistical theory and quantum statistical theory, and in such a way that 
the geometry takes on a satisfactory character. The extension of our approach to 
general states will be taken up elsewhere. 

The plan of the paper is as follows. In §2, the geometry of the parameter 
space induced by the Hilbert space norm is introduced by means of an index 
notation. This notation is employed here partially for the purpose of simplifying 
complicated calculations, and its usefulness in this respect will become evident. 
The index notation also greatly facilitates the geometrical interpretation of the
operations being represented here. Attention is drawn to formula (2.5) for the
Riemannian metric on $\mathcal{M}$, and the argument given in Proposition 2 that indicates
the special status of the Levi-Civita connection. Our idea is to reformulate
a number of the standard concepts of statistics in the language of Hilbert space 
geometry. In particular, in §3 and §4 we develop the theory of the maximum 
likelihood estimator (MLE) and the Cramer-Rao (CR) variance lower bound, for 
which novel geometrical interpretations are provided. See, for example, Proposition 4
and Theorem 1. Also, note Proposition 6, where a striking link is made
between an essentially statistical quantity and an essentially geometric quantity.
In §4 we consider in some detail properties of the canonical family of exponential
distributions, which can be described concisely in terms of the Hilbert space
geometry. This material also has interesting applications to statistical mechanics 
and thermal physics, which we discuss elsewhere. 

In §5, a set of higher-order corrections to the CR lower bound is obtained,
leading to what might appropriately be called generalised Bhattacharyya bounds,
given in Proposition 8. However, unlike the original Bhattacharyya bound, the
new variance bounds generally depend upon features of the estimator. Nevertheless,
in certain cases of interest the result is independent of the specific choice
of estimator. This will be illustrated with examples from problems in quantum
estimation. A brief account of the multi-parameter situation is given in §6.

After some comments on the transition from classical to quantum theory in §7, 
a general geometric formulation of ordinary quantum mechanics is developed in 
§8 and §9 in terms of a real Hilbert space setting. In §10 and §11 we apply our 
geometric estimation theory to the quantum mechanical state space. We are interested,
in particular, in the variance bounds associated with pairs of canonically
conjugate observables. Here we study in detail the example of time estimation
in quantum theory. This is pursued by means of a nonorthogonal resolution of
unity, known as a probability operator-valued measure (POM), which allows us to
construct a well-defined maximally symmetric time ‘observable’ within the framework
of ordinary quantum mechanics. Finally, in §12, we apply the generalised
variance lower bounds to obtain a remarkable set of higher order corrections to 
the Heisenberg relations.


2. Index Notation and Fisherian Geometry

Consider a real Hilbert space $\mathcal{H}$, equipped with a symmetric inner product
which we denote $g_{ab}$. As noted above, we adopt an index notation for Hilbert space
operations. Let us write $\xi^a$ for a typical vector in $\mathcal{H}$. If $\mathcal{H}$ is finite dimensional, the
index can be thought of as ranging over a set of integers, while in the infinite dimensional
case, the index is ‘abstract’. See Geroch (1971a,b), Penrose and Rindler (1984,
1986), or Wald (1994) for further details of this notation. Our intention here is
not to present a rigorous account of the matter, which would be beyond the
scope of the present work, but rather to illustrate the utility of the index calculus 
by way of a number of examples. In particular, in the infinite dimensional case 
there are technical conditions concerning the domains of operators that require 
care—these will not concern us here in the first instance, though in our treatment 
of quantum estimation more attention will be paid in this respect. 

Suppose we consider the space of all probability density functions $p(x)$ on the
sample space $\mathbb{R}^n$. By taking a square root we can map each density function to
a point $\xi^a$ on the unit sphere $\mathcal{S}$ in $\mathcal{H} = L^2(\mathbb{R}^n)$, given by $g_{ab}\xi^a\xi^b = 1$. A random
variable in $\mathcal{H}$ is then represented by a symmetric bilinear form, e.g., $X_{ab}$, with
expectation $X_{ab}\xi^a\xi^b$ in the state $\xi^a$, that is,

$$E_\xi[X] = X_{ab}\,\xi^a\xi^b . \qquad (2.1)$$

In terms of the conventional statistical notation, we can associate $\xi^a$ with $p(x)^{1/2}$,
$X_{ab}$ with $x\,\delta(x-y)$, and hence $X_{ab}\xi^a\xi^b$ with the integral

$$\int_x\int_y x\,\delta(x-y)\,p(x)^{1/2}\,p(y)^{1/2}\,dx\,dy , \qquad (2.2)$$

which reduces to the expectation. This line of reasoning can be extended to
more general expressions. Thus, for example, $X_{ab}X^{b}{}_{c}\,\xi^a\xi^c$ is the expectation of the
square of the random variable $X_{ab}$, and the variance of $X_{ab}$ in the state $\xi^a$ is

$$\mathrm{Var}_\xi[X] = \tilde X_{ac}\tilde X^{c}{}_{b}\,\xi^a\xi^b , \qquad (2.3)$$

where $\tilde X_{ab} = X_{ab} - g_{ab}(X_{cd}\xi^c\xi^d)$ represents the deviation $\Delta X$ of the random
variable from its mean. Likewise for the covariance of the random variables $X_{ab}$
and $Y_{ab}$ in the state $\xi^a$ we can write

$$\mathrm{Cov}_\xi[X,Y] = \tilde X_{ac}\tilde Y^{c}{}_{b}\,\xi^a\xi^b . \qquad (2.4)$$

Note that if $\xi^a$ is not normalised, then the formulae above can be generalised
with the inclusion of suitable normalisation factors.
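
As a finite-dimensional illustration of (2.1)-(2.4), the following sketch replaces the Hilbert space by $\mathbb{R}^4$ with the Euclidean inner product. The sample values `x`, `y` and the probabilities `p` are arbitrary choices made for this example and do not come from the text. A random variable becomes a diagonal matrix, and the state is the square-root density.

```python
import numpy as np

# Hypothetical discrete sample space and density (illustrative values only).
x = np.array([-1.0, 0.0, 2.0, 5.0])   # values taken by the random variable X
y = np.array([3.0, 1.0, 4.0, 1.0])    # values taken by a second variable Y
p = np.array([0.1, 0.2, 0.3, 0.4])    # probabilities, summing to one

xi = np.sqrt(p)                        # state vector: square-root density
X = np.diag(x)                         # symmetric form X_ab
Y = np.diag(y)                         # symmetric form Y_ab
g = np.eye(len(p))                     # inner product g_ab


def expectation(A, state):
    """E[A] = A_ab xi^a xi^b for a normalised state, as in (2.1)."""
    return state @ A @ state


def deviation(A, state):
    """Mean-adjusted form A_ab - g_ab (A_cd xi^c xi^d)."""
    return A - g * expectation(A, state)


def covariance(A, B, state):
    """Cov[A, B] built from the two deviations, as in (2.4)."""
    return state @ deviation(A, state) @ deviation(B, state) @ state


mean_X = expectation(X, xi)
var_X = covariance(X, X, xi)           # (2.3) is the special case B = A
cov_XY = covariance(X, Y, xi)
```

In this discrete setting the three quantities agree with the elementary sums $\sum p\,x$, $\sum p\,x^2 - (\sum p\,x)^2$ and $\sum p\,xy - (\sum p\,x)(\sum p\,y)$.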

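We consider now the unit sphere $\mathcal{S}$ in $\mathcal{H}$, and within this sphere a submanifold
$\mathcal{M}$ given parametrically by $\xi^a(\theta)$, where $\theta^i$ ($i = 1, \ldots, r$) are local parameters.
We write $\partial_i$ for $\partial/\partial\theta^i$.

Proposition 1. (Fisher-Rao metric). In local coordinates, the Riemannian
metric $\mathcal{G}_{ij}$ on $\mathcal{M}$, induced by $g_{ab}$, given by

$$\mathcal{G}_{ij} = 4\,g_{ab}\,\partial_i\xi^a\,\partial_j\xi^b , \qquad (2.5)$$

is the Fisher information matrix.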

The proof is as follows. We note that the squared distance between the endpoints
of two vectors $\xi^a$ and $\eta^a$ in $\mathcal{H}$ is $g_{ab}(\xi^a - \eta^a)(\xi^b - \eta^b)$. If both
endpoints lie on $\mathcal{M}$, and $\eta^a$ is obtained by infinitesimally displacing $\xi^a$ in $\mathcal{M}$, i.e.,
$\eta^a = \xi^a + \partial_i\xi^a\,d\theta^i$, then the separation $ds$ between the two endpoints on $\mathcal{M}$ is

$$ds^2 = \tfrac{1}{4}\,\mathcal{G}_{ij}\,d\theta^i d\theta^j , \qquad (2.6)$$

where $\mathcal{G}_{ij}$ is given as in (2.5). The factor of $\tfrac{1}{4}$ arises from the conventional
definition of the Fisher information matrix in terms of the log-likelihood function
$l(x|\theta) = \ln p(x|\theta)$, given by

$$\mathcal{G}_{ij} = \int_x p(x|\theta)\,\partial_i l(x|\theta)\,\partial_j l(x|\theta)\,dx . \qquad (2.7)$$

Indeed, since $\partial_i\sqrt{p} = \tfrac{1}{2}\sqrt{p}\,\partial_i l$, expression (2.5) reduces to (2.7).
By differentiating $g_{ab}\xi^a\xi^b = 1$ twice, we obtain an alternative expression for
$\mathcal{G}_{ij}$, that is, $\mathcal{G}_{ij} = -4\,\xi_a\,\partial_i\partial_j\xi^a$. This formula turns out to be useful in statistical
mechanics (see, e.g., Brody and Rivier 1995, Streater 1996), where the geometry
of the relevant coupling constant space can be investigated. The induced geometry
of $\mathcal{M}$ can be studied in terms of the metric $\mathcal{G}_{ij}$ and our subsequent analysis will
be pursued on this basis. To start, we note the following result:

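Lemma 1. The Christoffel symbols for the metric connection arising from $\mathcal{G}_{ij}$
are given by $\Gamma^i_{jk} = 4\,\mathcal{G}^{il}\,\partial_l\xi^a\,\partial_j\partial_k\xi_a$.

This can be obtained by insertion of (2.5) into the familiar formula

$$\Gamma^i_{jk} = \tfrac{1}{2}\,\mathcal{G}^{il}\left(\partial_j\mathcal{G}_{kl} + \partial_k\mathcal{G}_{jl} - \partial_l\mathcal{G}_{jk}\right) \qquad (2.8)$$

for the Levi-Civita (metric) connection. Now, let $\nabla_i$ denote the standard Levi-Civita
covariant derivative operator associated with $\mathcal{G}_{ij}$, for which $\nabla_i\mathcal{G}_{jk} = 0$ and
$\nabla_i$ is torsion free. Then a straightforward calculation shows that

$$\mathcal{G}_{ij} = -4\,\xi_a\nabla_i\nabla_j\xi^a . \qquad (2.9)$$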

A question that naturally arises is, are there any other ‘natural’ connections
associated with the given Hilbert space structure? This requires one to construct
a tensor of the form $Q_{ijk}$ purely from the metric and covariant derivatives of
the state vector $\xi^a$. The answer to this question is of relevance, since we would
like to know whether it is possible to construct a set of affine connections (e.g.,
Amari’s $\alpha$-connection) purely in terms of the given basic Hilbert space geometry,
or whether extra structure is required. Clearly, the only possibilities are
$\nabla_i\xi^a\nabla_j\nabla_k\xi_a$ and $\xi^a\nabla_i\nabla_j\nabla_k\xi_a$. However, some straightforward algebra leads us
to the following result.

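Proposition 2. The expressions $\nabla_i\xi^a\nabla_j\nabla_k\xi_a$ and $\xi^a\nabla_i\nabla_j\nabla_k\xi_a$ vanish. Thus,
no natural three-index tensors can be constructed in Hilbert space, and the Levi-Civita
connection is distinguished amongst possible $\alpha$-connections.

The proof is as follows. First, note that $\nabla_k\mathcal{G}_{ij} = 0$ implies $\nabla_k(\nabla_i\xi^a\nabla_j\xi_a) = 0$,
and hence $\nabla_i\xi^a\nabla_k\nabla_j\xi_a + \nabla_j\xi^a\nabla_k\nabla_i\xi_a = 0$. On the other hand, it follows
from (2.9), by differentiation, that $\nabla_k\xi^a\nabla_i\nabla_j\xi_a + \xi^a\nabla_k\nabla_i\nabla_j\xi_a = 0$. Therefore, we
deduce that $\nabla_k\nabla_i\xi^a\nabla_j\xi_a = -\xi^a\nabla_j\nabla_k\nabla_i\xi_a$, and that $\xi^a\nabla_{(i}\nabla_{j)}\nabla_k\xi_a = 0$. Since
$\xi^a\nabla_i\nabla_j\nabla_k\xi_a$ is antisymmetric over the indices $i, j$, it follows that

$$\xi^a\nabla_i\nabla_j\nabla_k\xi_a = \tfrac{1}{2}\,\xi^a R_{ijk}{}^{l}\,\nabla_l\xi_a , \qquad (2.10)$$

where $R_{ijk}{}^{l}$ is the Riemann tensor, defined by $(\nabla_i\nabla_j - \nabla_j\nabla_i)V_k = R_{ijk}{}^{l}V_l$ for
any smooth vector field $V_k$. However, $\xi^a\nabla_l\xi_a$ vanishes in (2.10), since $\xi^a\xi_a = 1$,
and that establishes the desired result.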

Therefore, to introduce other affine connections on $\mathcal{M}$, such as Amari’s
$\alpha$-connection, additional structure on the given Hilbert space is required. Although
these ‘artificial’ connections are useful in certain statistical inference problems, 
such as higher order asymptotics, we conclude that from a Hilbert space point of 
view the Levi-Civita connection is the only ‘natural’ connection associated with 
the space of probability measures. 

Note, incidentally, that in the case of a one-parameter family of distributions, 
the Fisher information is given by $\mathcal{G} = 4\,g_{ab}\,\dot\xi^a\dot\xi^b$, where the dot denotes
differentiation with respect to $\theta$. Thus, the Fisher information is related in a simple way to
the ‘velocity’ along the given curve in Hilbert space. This is a result that, as we 
see later (cf. Proposition [I^), has profound links with an analogous construction 
in quantum mechanics (Anandan and Aharonov 1990). 
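
The identification of the Fisher information with four times the squared ‘velocity’ can be checked numerically. The sketch below is only an illustration under assumptions of our own choosing: the model is a binomial family with `N_TRIALS = 5`, the evaluation point `theta0 = 0.3` is arbitrary, and the derivative of the square-root density is approximated by central differences.

```python
import math
import numpy as np

N_TRIALS = 5  # illustrative choice, not taken from the text


def xi(theta):
    """Square-root density of a binomial(N_TRIALS, theta) model."""
    k = np.arange(N_TRIALS + 1)
    binom = np.array([math.comb(N_TRIALS, j) for j in k], dtype=float)
    return np.sqrt(binom * theta**k * (1.0 - theta)**(N_TRIALS - k))


def fisher_from_velocity(theta, h=1e-5):
    """4 |d xi / d theta|^2, with the derivative taken by central differences."""
    xi_dot = (xi(theta + h) - xi(theta - h)) / (2.0 * h)
    return 4.0 * xi_dot @ xi_dot


def fisher_from_score(theta):
    """Conventional definition (2.7): sum over x of p (d ln p / d theta)^2."""
    k = np.arange(N_TRIALS + 1)
    p = xi(theta)**2
    score = k / theta - (N_TRIALS - k) / (1.0 - theta)
    return np.sum(p * score**2)


theta0 = 0.3
G_velocity = fisher_from_velocity(theta0)
G_score = fisher_from_score(theta0)
G_closed_form = N_TRIALS / (theta0 * (1.0 - theta0))  # textbook binomial value
```

The three numbers coincide up to the finite-difference error, which is the content of Proposition 1 in the one-parameter case.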


3. Maximum Likelihood Estimation 


Suppose we are given a random variable $X_{ab}$ which takes real values, and told
that the result of a sampling of $X_{ab}$ is the number $x$. We are interested in a
situation where we have a one-parameter family of normalised states $\xi^a(\theta)$
characterising the distribution of $x$. The parameter $\theta$ determines the unknown state
of nature, and we wish to estimate $\theta$ by use of maximum likelihood methods; that
is, we wish to associate with a given value of $x$ an appropriate value of $\theta$ that
maximises the likelihood function. In this section, we present a geometrical
characterisation of the maximum likelihood estimator (MLE), which has an elegant
Hilbert space interpretation once we single out a ‘preferred’ random variable $X_{ab}$.


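Proposition 3. Given the random variable $X_{ab}$, normalised state vector
$\xi^a(\theta)$, and the measurement outcome $x$, the parameterised likelihood function
$p(x|\theta)$ is given by

$$p(x|\theta) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \xi^a\xi_b\,\exp\!\left[i\lambda\left(X^{b}{}_{a} - x\,\delta^{b}{}_{a}\right)\right] d\lambda . \qquad (3.1)$$
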

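This can be seen as follows. We define the projection operator $\Delta^{a}{}_{b}(X,x)$ associated
with the random variable $X_{ab}$ and the measurement outcome $x$ by

$$\Delta^{a}{}_{b}(X,x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \exp\!\left[i\lambda\left(X^{a}{}_{b} - x\,\delta^{a}{}_{b}\right)\right] d\lambda . \qquad (3.2)$$

We note that $\Delta^{a}{}_{b}(X,x)\,\Delta^{b}{}_{c}(X,y) = \Delta^{a}{}_{c}(X,x)\,\delta(x-y)$, and that $X_{ab}$ can be
recovered from $\Delta_{ab}(X,x)$ via the spectral resolution

$$X_{ab} = \int_{-\infty}^{\infty} x\,\Delta_{ab}(X,x)\,dx . \qquad (3.3)$$

Then, the likelihood function $p(x|\theta)$ is the expectation of $\Delta_{ab}$ in the state $\xi^a$, i.e.,

$$p(x|\theta) = \Delta_{ab}\,\xi^a\xi^b . \qquad (3.4)$$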

Alternatively, $p(x|\theta)$ can be obtained by taking the Fourier transform of the
characteristic function

$$\phi_\theta(\lambda) = \xi^a\xi_b\,\exp\!\left[i\lambda X^{b}{}_{a}\right] , \qquad (3.5)$$

which leads back to (3.1). The maximum likelihood estimate $\hat\theta(x)$ for $\theta$, assuming
it exists and is unique, is obtained by solving

$$\left.\frac{\partial p(x|\theta)}{\partial\theta}\right|_{\theta=\hat\theta} = 0 . \qquad (3.6)$$

Geometrically, this means that, along the curve $\xi^a(\theta)$ on the sphere $\mathcal{S}$, $\hat\theta(x)$
maximises the quadratic form $\Delta_{ab}\xi^a\xi^b$. Conversely, if $\hat\theta(x)$ is the MLE for the
parameter $\theta$, then the corresponding random variable in $\mathcal{H}$ is

$$\hat\Theta_{ab} = \int_{-\infty}^{\infty} \hat\theta(x)\,\Delta_{ab}(X,x)\,dx . \qquad (3.7)$$

If we let $\Delta_x(\xi^a)$ denote the quadratic form $\Delta_{ab}\xi^a\xi^b$ on $\mathcal{H}$, then equation (3.6)
for the MLE can be rewritten as $\dot\xi^a\nabla_a\Delta_x = 0$, where the ‘gradient’ operator $\nabla_a$
is defined by $\nabla_a = \partial/\partial\xi^a$, so $\nabla_a(\Delta_{bc}\xi^b\xi^c) = 2\Delta_{ab}\xi^b$. Thus, for each given value
of $x$ we can foliate $\mathcal{S}$ with hypersurfaces of constant $\Delta_x$. This leads us to the
following characterisation of the MLE.

Proposition 4. The maximum likelihood estimate $\hat\theta(x)$ is the value of $\theta$, for
each given value of $x$, such that the tangent of the curve $\xi^a(\theta)$ is orthogonal to
the normal vector of the constant $\Delta_x$ surface passing through the point $\xi^a(\theta)$.

Thus, we see that maximum likelihood estimation has a characterisation in
terms of Hilbert space geometry that can be achieved by introducing extra structure
on $\mathcal{H}$, namely, by ‘singling out’ a particular observable. This is natural in the
context of some classical statistical investigations, though for quantum statistical
inference we may wish to avoid the introduction of ‘preferred’ observables.
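
The orthogonality condition of Proposition 4 can be illustrated in a discrete setting, where the projection operator reduces to a matrix with a single unit entry. The sketch below rests on assumptions made only for this example: a binomial model with `N_TRIALS = 10`, an observed outcome `x_obs = 3`, and a grid search in place of an analytic maximisation.

```python
import math
import numpy as np

N_TRIALS = 10  # illustrative choice, not taken from the text


def xi(theta):
    """Square-root density of a binomial(N_TRIALS, theta) model."""
    k = np.arange(N_TRIALS + 1)
    binom = np.array([math.comb(N_TRIALS, j) for j in k], dtype=float)
    return np.sqrt(binom * theta**k * (1.0 - theta)**(N_TRIALS - k))


def projector(x_obs):
    """Discrete analogue of Delta_ab(X, x): projects onto the outcome x_obs."""
    delta = np.zeros((N_TRIALS + 1, N_TRIALS + 1))
    delta[x_obs, x_obs] = 1.0
    return delta


def likelihood(theta, x_obs):
    """p(x|theta) = Delta_ab xi^a xi^b, the discrete analogue of (3.4)."""
    state = xi(theta)
    return state @ projector(x_obs) @ state


def mle(x_obs, grid):
    """Grid point at which the quadratic form is largest along the curve."""
    values = np.array([likelihood(t, x_obs) for t in grid])
    return grid[np.argmax(values)]


grid = np.linspace(0.001, 0.999, 999)    # step of 0.001
x_obs = 3
theta_hat = mle(x_obs, grid)

# Tangent of the curve versus the normal 2 Delta_ab xi^b of the
# constant-likelihood surface through xi(theta_hat).
h = 1e-6
tangent = (xi(theta_hat + h) - xi(theta_hat - h)) / (2.0 * h)
normal = 2.0 * projector(x_obs) @ xi(theta_hat)
orthogonality = tangent @ normal
```

For the binomial model the maximiser is the familiar ratio `x_obs / N_TRIALS`, and `orthogonality` vanishes up to discretisation error.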


4. Cramer-Rao Lower Bound and Exponential Families 

In the case of a general estimation problem, a lower bound can be established for 
the variance with which the estimate deviates from the true value of the relevant 
parameter. Our intention in this section is to present a geometric characterisation 
of this bound. In doing so we also make some observations about the geometry of 
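exponential families of distributions, of relevance to statistical physics. Consider
a curve $\xi^a(\theta)$ in $\mathcal{S}$. We say that a random variable $T_{ab}$ is an unbiased estimator
for the function $\tau(\theta)$ if

$$T_{ab}\,\xi^a(\theta)\,\xi^b(\theta) = \tau(\theta) . \qquad (4.1)$$

For convenience, we define a mean-adjusted deviation operator $\tilde T_{ab} = T_{ab} - \tau g_{ab}$.
Note that $\tilde T_{ab}\xi^a\xi^b = 0$, and that the variance of $T$ is given by $\mathrm{Var}_\xi[T] = \tilde T_{ab}\tilde T^{b}{}_{c}\,\xi^a\xi^c$.
Since $T_{ab}\xi^a\xi^b = \tau$, we obtain $2T_{ab}\xi^a\dot\xi^b = \dot\tau$, hence $2\tilde T_{ab}\xi^a\dot\xi^b = \dot\tau$, because
$\xi_a\dot\xi^a = 0$ on the sphere. Therefore, if we
define $\eta_b = \tilde T_{ab}\xi^a$, we have $(\eta_a\dot\xi^a)^2 = \dot\tau^2/4$. Whence by use of the Cauchy-Schwarz
inequality $(\eta^a\eta_a)(\dot\xi^b\dot\xi_b) \geq (\eta_a\dot\xi^a)^2$, we are led to the following result.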

Theorem 1. (Cramer-Rao inequality). Let $T$ be an unbiased estimator for a
function $\tau(\theta)$, where $\theta$ parametrises a one-dimensional family of states $\xi^a(\theta)$ in
$\mathcal{S} \subset \mathcal{H}$. Then, the variance lower bound in the state $\xi^a$ is given by

$$\mathrm{Var}_\xi[T] \geq \frac{\dot\tau^2}{4\,g_{ab}\,\dot\xi^a\dot\xi^b} . \qquad (4.2)$$


It is clear from the preceding argument that the CR lower bound is attained
only if $\dot\xi^a = c\,\eta^a$ for some constant $c$, which by rescaling $\theta$ we can set to 1/2
without loss of generality. Thus, for any curve $\xi^a(\theta)$ achieving the lower bound,
we obtain the differential equation

$$\dot\xi^a = \tfrac{1}{2}\,\tilde T^{a}{}_{b}\,\xi^b . \qquad (4.3)$$

The solution of (4.3) is the canonical exponential family of distributions, given
by the following elegant formula:

$$\xi^a(\theta) = \frac{\exp\!\left[\tfrac{1}{2}\theta\,T^{a}{}_{b}\right] q^b}{\sqrt{\exp\!\left[\theta\,T^{c}{}_{d}\right] q^d q_c}} , \qquad (4.4)$$

where the normalised state vector $\xi^a(0) = q^a/(q^b q_b)^{1/2}$ determines a prescribed
initial distribution. Without loss of generality we can set $q^a q_a = 1$. This expression
leads us to an interesting geometrical interpretation of the exponential family. We
consider the unit sphere $\mathcal{S}$ in $\mathcal{H}$, with the standard spherical geometry induced
on it by $g_{ab}$. Let

$$\tau(\xi) = \frac{T_{ab}\,\xi^a\xi^b}{g_{cd}\,\xi^c\xi^d} \qquad (4.5)$$

be a quadratic form defined on $\mathcal{S}$. Then $\mathcal{S}$ can be foliated by surfaces of constant
$\tau$. Since according to (4.3) the tangent vector $\dot\xi^a$ is parallel to the gradient of the
function $\tau(\xi^a)$, we conclude that:


Proposition 5. The canonical exponential family of distributions $\xi^a(\theta)$, with
initial distribution $q^a$, is given by the unique curve through the point $q^a$ that is
everywhere orthogonal to the family of foliating $\tau$-surfaces.
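
A finite-dimensional check of (4.3) and (4.4) may be helpful here. In the sketch below the symmetric matrix `T`, the initial state `q`, the dimension `n = 5` and the evaluation point `theta0` are hypothetical choices for illustration; the matrix exponential is computed from the spectral decomposition of `T`, and the tangent vector is approximated by central differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n))
T = (A + A.T) / 2.0                  # hypothetical symmetric estimator T_ab
q = rng.normal(size=n)
q /= np.linalg.norm(q)               # normalised initial state q^a

evals, evecs = np.linalg.eigh(T)


def exp_T(s):
    """Matrix exponential exp(s T) via the spectral decomposition of T."""
    return evecs @ np.diag(np.exp(s * evals)) @ evecs.T


def xi(theta):
    """Canonical exponential family in the form (4.4)."""
    numerator = exp_T(0.5 * theta) @ q
    return numerator / np.sqrt(q @ exp_T(theta) @ q)


theta0, h = 0.7, 1e-5
state = xi(theta0)
xi_dot = (xi(theta0 + h) - xi(theta0 - h)) / (2.0 * h)

tau = state @ T @ state
T_tilde = T - tau * np.eye(n)        # mean-adjusted deviation operator
residual = np.linalg.norm(xi_dot - 0.5 * T_tilde @ state)   # tests (4.3)

variance = state @ T_tilde @ T_tilde @ state
tau_dot = 2.0 * state @ T @ xi_dot
cr_bound = tau_dot**2 / (4.0 * xi_dot @ xi_dot)             # right side of (4.2)
```

Along this curve `residual` is zero up to finite-difference error and `variance` equals `cr_bound`, so the bound of Theorem 1 is attained.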


In particular, as we show in Proposition 6 below, the variance $\mathrm{Var}_\xi[T]$ at the
point $\xi^a$ is a quarter of the squared magnitude of the gradient of the surface
through $\xi^a$ given by $\nabla_a\tau$. The Fisher information, on the other hand, is four
times the squared magnitude of the tangent vector $\dot\xi^a$ to the curve at $\xi^a$. Since the
inner product of the tangent vector $\dot\xi^a$ and the normal vector $\nabla_a\tau$ is the derivative
$\dot\tau$, it follows that $\mathrm{Var}_\xi[T] \geq \dot\tau^2/\mathcal{G}$, the CR inequality.

Proposition 6. Let $\nabla_a = \partial/\partial\xi^a$ denote the gradient operator in $\mathcal{H}$. Then the
variance of an unbiased estimator $T_{ab}$ for a function $\tau$ in the state $\xi^a$ is given, on
the sphere $\mathcal{S}$, by

$$\mathrm{Var}_\xi[T] = \tfrac{1}{4}\,\nabla_a\tau\,\nabla^a\tau . \qquad (4.6)$$


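This can be verified as follows. By definition, we have a quadratic form (4.5)
on $\mathcal{H}$. Then, by differentiation, we obtain

$$\nabla_a\tau = \frac{2\left(T_{ab}\,\xi^b - \tau\,\xi_a\right)}{\xi^c\xi_c} , \qquad (4.7)$$

from which it follows that

$$\tfrac{1}{4}\,\nabla^a\tau\,\nabla_a\tau = \frac{\left(T_{ac}T^{c}{}_{b} - \tau^2 g_{ab}\right)\xi^a\xi^b}{(\xi^d\xi_d)^2} . \qquad (4.8)$$

Since the variance of $T_{ab}$ is given by $\mathrm{Var}_\xi[T] = (T_{ac}T^{c}{}_{b} - \tau^2 g_{ab})\,\xi^a\xi^b$, equation
(4.6) follows at once after we restrict (4.8) to the sphere $\xi^a\xi_a = 1$.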
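
The relation between the variance and the gradient can be tested directly in $\mathbb{R}^n$. In the sketch below the matrix `T`, the point `xi` on the unit sphere and the dimension `n = 6` are hypothetical, and the gradient of the normalised quadratic form is evaluated by central differences rather than analytically, so that the check does not presuppose the formula being verified.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n))
T = (A + A.T) / 2.0               # hypothetical symmetric estimator T_ab
xi = rng.normal(size=n)
xi /= np.linalg.norm(xi)          # a point on the unit sphere S


def tau(v):
    """Normalised quadratic form T_ab v^a v^b / (v^c v_c), cf. (4.5)."""
    return (v @ T @ v) / (v @ v)


def numerical_gradient(f, v, h=1e-6):
    """Central-difference gradient of f at v, one coordinate at a time."""
    grad = np.zeros_like(v)
    for a in range(len(v)):
        step = np.zeros_like(v)
        step[a] = h
        grad[a] = (f(v + step) - f(v - step)) / (2.0 * h)
    return grad


grad_tau = numerical_gradient(tau, xi)
quarter_grad_sq = 0.25 * grad_tau @ grad_tau

mean = xi @ T @ xi
variance = xi @ T @ T @ xi - mean**2   # (T_ac T^c_b - tau^2 g_ab) xi^a xi^b
```

The two numbers `quarter_grad_sq` and `variance` agree up to the discretisation error, and `grad_tau` is orthogonal to `xi`, as expected for a function that is constant along rays.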

In the case of the exponential family of distributions, the corresponding density
function is given by $p(x|\theta) = q(x)\exp[x\theta - \psi(\theta)]$, where $q(x)$ is the prescribed
initial density, and the normalisation constant $\psi(\theta)$ is given by

$$\psi(\theta) = \ln\int_{-\infty}^{\infty} e^{x\theta}\,q(x)\,dx = \ln\!\left(\exp\!\left[\theta\,T^{a}{}_{b}\right] q^b q_a\right) . \qquad (4.9)$$



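It is interesting to note that the log-likelihood $l(x|\theta)$ for an exponential family
has a natural geometric characterisation in $\mathcal{H}$. Suppose we consider a multi-parameter
exponential distribution given by

$$\xi^a(\theta) = \exp\!\left[\tfrac{1}{2}\left(\sum_{j=1}^{r}\theta^j\,T_{(j)}{}^{a}{}_{b} - \psi(\theta)\,\delta^{a}{}_{b}\right)\right] q^b . \qquad (4.10)$$

Our idea is to construct a random variable $l_{ab}$ in $\mathcal{H}$ that represents the
log-likelihood $l(x|\theta)$ for this family of distributions. We define the log-likelihood $l^{a}{}_{b}$
associated with an exponential family of distributions by the symmetric operator

$$l^{a}{}_{b}(\theta) = \sum_{j=1}^{r}\theta^j\,T_{(j)}{}^{a}{}_{b} - \psi(\theta)\,\delta^{a}{}_{b} . \qquad (4.11)$$
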

Note that the expectation of $l_{ab}$ gives the Shannon entropy, that is,

$$S(\theta) = l_{ab}\,\xi^a\xi^b = \sum_{i=1}^{r}\theta^i\,\partial_i\psi(\theta) - \psi(\theta) . \qquad (4.12)$$

The second expression is the familiar one for the Legendre transformation that
relates the entropy $S(\theta)$ to the normalisation constant $\psi(\theta)$. In the case of a
one-parameter family of exponential distributions, the gradient $\nabla_a\tau$ can be written
$\tfrac{1}{2}\nabla_a\tau = \dot l_{ab}\,\xi^b$. In the multi-parameter case this becomes $\tfrac{1}{2}\nabla_a\tau_{(i)} = \xi^b\,\partial_i l_{ab}$, which
leads to the following formula for the Fisher information:

Proposition 7. The Fisher information matrix $\mathcal{G}_{ij}$ can be expressed in terms
of the log-likelihood $l^{a}{}_{b}$ by the formula $\mathcal{G}_{ij} = \partial_i l_{ac}\,\partial_j l^{c}{}_{b}\,\xi^a\xi^b$.

Thus in the case of an exponential family of distributions we find the Fisher-Rao
metric is given by the covariance matrix of the estimators $T_{(i)}$:

$$\mathcal{G}_{ij} = \left(T_{(i)ac} - \partial_i\psi\,g_{ac}\right)\left(T_{(j)}{}^{c}{}_{b} - \partial_j\psi\,\delta^{c}{}_{b}\right)\xi^a\xi^b = E_\xi[\tilde T_{(i)}\tilde T_{(j)}] . \qquad (4.13)$$


5. Generalised Bhattacharyya Bounds


We have observed that the exponential family is the only family capable of
achieving the variance lower bound, provided we choose the right function $\tau(\theta)$
of the parameter to estimate. For other families of distributions, the variance
exceeds the lower bound. In order to obtain sharper bounds in the general situation,
we consider the possibility of establishing higher-order corrections to the CR
lower bound. Our approach is related to that of Bhattacharyya (1946, 1947, 1948).
However, in a Hilbert space context, we are led along a different route from
Bhattacharyya’s original considerations, since in his approach the likelihood function
$p(x|\theta)$ plays a crucial role. First, we shall formulate a new, Bhattacharyya-style
derivation of the CR inequality. We note that if $T_{ab}$ is an unbiased estimator
for the function $\tau(\theta)$, then so is $R_{ab} = T_{ab} + \lambda\,\xi_{(a}\dot\xi_{b)}$ for an arbitrary constant
$\lambda$, where the round brackets denote symmetrisation, $\xi_{(a}\dot\xi_{b)} = \tfrac{1}{2}(\xi_a\dot\xi_b + \dot\xi_a\xi_b)$.
We choose the value of $\lambda$ that minimises the variance of $R_{ab}$. This implies
$\lambda = -\dot\tau/(\dot\xi^a\dot\xi_a)$, and hence

$$\min\left(\mathrm{Var}_\xi[R]\right) = \mathrm{Var}_\xi[T] - \frac{\dot\tau^2}{4\,\dot\xi^a\dot\xi_a} . \qquad (5.1)$$

Since $\mathrm{Var}_\xi[R] \geq 0$, we are immediately led back to the CR inequality (4.2).

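Now we try to improve on this by incorporating terms with higher-order derivatives.
Let us denote the $r$-th derivative of $\xi^a$ with respect to the parameter $\theta$ by
$\xi^{(r)a} = d^r\xi^a/d\theta^r$. We write $\hat\xi^{(r)a}$ for the projection of $\xi^{(r)a}$ orthogonal to $\xi^a$ and
to all the lower order derivatives, so $\hat\xi^{(r)a}\xi_a = 0$ and $\hat\xi^{(r)a}\hat\xi^{(s)}_a = 0$ for $s < r$. If $T_{ab}$
is an unbiased estimator for $\tau(\theta)$, so is the symmetric tensor $R_{ab}$ defined by

$$R_{ab} = T_{ab} + \sum_r \lambda_r\,\xi_{(a}\hat\xi^{(r)}_{b)} \qquad (5.2)$$

for arbitrary constants $\lambda_r$. We only consider values of $r$ such that $\hat\xi^{(r)}_a \neq 0$, assuming
that the relevant derivatives exist and are linearly independent. A straightforward
calculation leads us to the values of $\lambda_r$ minimising the variance of $R$, and
we obtain

$$\min\left(\mathrm{Var}_\xi[R]\right) = \mathrm{Var}_\xi[T] - \sum_r \frac{\left(\tilde T_{ab}\,\xi^a\hat\xi^{(r)b}\right)^2}{g_{ab}\,\hat\xi^{(r)a}\hat\xi^{(r)b}} . \qquad (5.3)$$

Since $\mathrm{Var}_\xi[R]$ is nonnegative, we thus deduce the following generalised Bhattacharyya
bounds for the variance of the estimator:

$$\mathrm{Var}_\xi[T] \geq \sum_r \frac{\left(\tilde T_{ab}\,\xi^a\hat\xi^{(r)b}\right)^2}{g_{ab}\,\hat\xi^{(r)a}\hat\xi^{(r)b}} . \qquad (5.4)$$
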

This derivation is ‘historical’ in flavour in the sense that it parallels certain
aspects of the original argument of Bhattacharyya. However, Proposition 6 allows
us to reexpress (5.4) in the form of a simple geometric inequality. That is, given
the gradient vector $\nabla_a\tau$ in $\mathcal{H}$, the squared length of this vector is not less than the
sum of the squares of any of its orthogonal components with respect to a suitable
basis. To this end, we consider the vectors $\hat\xi^{(r)a}$ based on the state $\xi^a$ and its higher
order derivatives, and form the orthonormal vectors given by $\hat\xi^{(r)a}/(\hat\xi^{(r)b}\hat\xi^{(r)}_b)^{1/2}$.
It follows from the basic relation (4.6) deduced in Proposition 6 that:

Proposition 8. The generalised variance lower bounds for an unbiased estimator
$T$ of a function $\tau$ can be expressed in the form:

$$\mathrm{Var}_\xi[T] \geq \frac{1}{4}\sum_r \frac{\left(\hat\xi^{(r)a}\nabla_a\tau\right)^2}{\hat\xi^{(r)b}\hat\xi^{(r)}_b} . \qquad (5.5)$$

Clearly for $r = 1$ we recover the CR inequality. Unlike the classical Bhattacharyya
bounds, the generalised bounds are not necessarily independent of
the estimator $T$. In our applications to quantum mechanics, however, we shall
indicate some important examples of higher-order bounds that are indeed independent
of the specific choice of estimator. See Brody and Hughston (1996d) for
a related example drawn from classical thermal physics where the bounds are
also systematically independent of the estimator.
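
To see how the second-order term sharpens the CR bound, one can evaluate the first two terms of (5.5) for a family that is not of exponential type. The sketch below is an illustration under assumptions of our own: a discretised Cauchy-type location family on the hypothetical sample space {0, ..., 7}, the estimator `T` taken to be the observed value, and derivatives of the state approximated by finite differences.

```python
import numpy as np

k = np.arange(8.0)                       # hypothetical sample space {0,...,7}
T = np.diag(k)                           # estimator: the observed value itself


def xi(theta):
    """Square-root density of a discretised Cauchy-type location family."""
    weights = 1.0 / (1.0 + (k - theta)**2)
    return np.sqrt(weights / weights.sum())


theta0, h = 3.3, 1e-3
state = xi(theta0)
d1 = (xi(theta0 + h) - xi(theta0 - h)) / (2.0 * h)            # first derivative
d2 = (xi(theta0 + h) - 2.0 * state + xi(theta0 - h)) / h**2   # second derivative


def project_out(v, basis):
    """Remove from v its components along mutually orthogonal basis vectors."""
    for b in basis:
        v = v - (v @ b) / (b @ b) * b
    return v


hat1 = project_out(d1, [state])          # equals d1 up to rounding
hat2 = project_out(d2, [state, hat1])

tau = state @ T @ state
T_tilde = T - tau * np.eye(len(k))
variance = state @ T_tilde @ T_tilde @ state

eta = T_tilde @ state                    # half the gradient of tau on S
terms = [(eta @ v)**2 / (v @ v) for v in (hat1, hat2)]
cr_bound = terms[0]
second_order_bound = terms[0] + terms[1]
```

By construction `cr_bound <= second_order_bound <= variance`; the second inequality is the statement that the squared length of a vector is not less than the sum of the squares of its components along orthogonal directions.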


We remark, incidentally, that the denominator terms in equation (5.5) give rise
to natural geometric invariants. For example, in the case $r = 2$ we have

$$g_{ab}\,\hat\xi^{(2)a}\hat\xi^{(2)b} = \left(\dot\xi^a\dot\xi^b g_{ab}\right)^2 K_\xi^2 , \qquad (5.6)$$

where $K_\xi$ is the curvature of the curve $\xi^a(\theta)$ in $\mathcal{S}$. In particular, we obtain

Lemma 2. In the case of the canonical exponential family of distributions,
specified in equation (4.4), the curvature of $\xi^a(\theta)$ is given by:

$$K_\xi^2 = \frac{\langle\tilde T^4\rangle}{\langle\tilde T^2\rangle^2} - \frac{\langle\tilde T^3\rangle^2}{\langle\tilde T^2\rangle^3} - 1 , \qquad (5.7)$$

where $\langle\tilde T^n\rangle$ denotes the $n$-th central moment of $T$ in the state $\xi^a$.

As a matter of interpretation we note that the first term in the right hand side
of (5.7) is the kurtosis (measure of sharpness) of the distribution, while the second
term is the squared skewness (measure of asymmetry). A classical statistical inequality
relating these quantities (cf. Stuart and Ord 1994) ensures that $K_\xi^2 \geq 0$. In the
case of the exponential family we have $\hat\xi^{(2)a}\nabla_a\tau = 0$, i.e., the ‘acceleration vector’
$\hat\xi^{(2)a}$ lies in the tangent space of the surfaces generated by constant values of the
estimator function $\tau(\theta)$.
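
The curvature formula for the exponential family lends itself to a direct numerical comparison. The sketch below assumes a hypothetical five-point sample space with values `t` and initial density `q`, builds the one-parameter exponential family from them, and compares the geometric definition of the squared curvature (projected second derivative divided by the squared first derivative norm squared) with the combination of central moments described above. Derivatives are approximated by finite differences.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 4.0, 7.0])      # hypothetical values of T
q = np.array([0.3, 0.25, 0.2, 0.15, 0.1])    # hypothetical initial density


def density(theta):
    """One-parameter exponential family q(x) exp(theta x) / Z(theta)."""
    w = q * np.exp(theta * t)
    return w / w.sum()


def xi(theta):
    return np.sqrt(density(theta))


theta0, h = 0.2, 1e-3
state = xi(theta0)
d1 = (xi(theta0 + h) - xi(theta0 - h)) / (2.0 * h)
d2 = (xi(theta0 + h) - 2.0 * state + xi(theta0 - h)) / h**2

# Projection of the second derivative orthogonal to the state and its tangent.
hat2 = d2 - (d2 @ state) * state - (d2 @ d1) / (d1 @ d1) * d1
curvature_sq_geometric = (hat2 @ hat2) / (d1 @ d1)**2

p = density(theta0)
mean = p @ t
mu2, mu3, mu4 = (p @ (t - mean)**n for n in (2, 3, 4))   # central moments
curvature_sq_moments = mu4 / mu2**2 - mu3**2 / mu2**3 - 1.0
```

The two values agree up to the finite-difference error, and both are nonnegative.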


6. Multiple Parameters 

The geometrical constructions so far considered are based mainly upon
one-parameter families of distributions. However, for completeness here we sketch
some useful results applicable to multi-parameter distributions. First, consider the
case where we estimate a single function $\tau(\theta)$ depending upon several parameters
$\theta^i$. A straightforward argument shows that the CR inequality then takes the form

$$\mathrm{Var}_\xi[T] \geq \sum_{ij}\mathcal{G}^{ij}\,\partial_i\tau\,\partial_j\tau , \qquad (6.1)$$

where $\partial_i\tau = \partial\tau(\theta)/\partial\theta^i$ and $\mathcal{G}^{ij}$ is the inverse of the Fisher information matrix. In a
more general situation, we might have several estimators $T_{(\alpha)ab}$ ($\alpha = 1, 2, \ldots, r$)
labelled by an index $\alpha$, with $T_{(\alpha)ab}\,\xi^a\xi^b = \tau_\alpha$. For an arbitrary set of constants
$\Lambda_\alpha$, we form the sums $T_{ab} = \sum_\alpha\Lambda_\alpha T_{(\alpha)ab}$ and $\tau(\theta) = \sum_\alpha\Lambda_\alpha\tau_\alpha$. It follows that
the CR inequality (6.1) holds for the summed expressions $T_{ab}$ and $\tau(\theta)$. However,
since $\Lambda_\alpha$ is constant, the variance of $T$ can be written

$$\mathrm{Var}_\xi[T] = \sum_{\alpha\beta} C_{\alpha\beta}\,\Lambda_\alpha\Lambda_\beta , \qquad (6.2)$$

where $C_{\alpha\beta} = \mathrm{Cov}_\xi[T_{(\alpha)}, T_{(\beta)}]$ is the covariance matrix for the estimators $T_{(\alpha)}$.
Therefore, the CR inequality can be rewritten in the form

$$\sum_{\alpha\beta}\Big(C_{\alpha\beta} - \sum_{ij}\mathcal{G}^{ij}\,\partial_i\tau_\alpha\,\partial_j\tau_\beta\Big)\Lambda_\alpha\Lambda_\beta \geq 0 . \qquad (6.3)$$

Since this holds for arbitrary values of $\Lambda_\alpha$, we obtain the following matrix
inequality for the covariance lower bound.

Proposition 9. Let $T_{(\alpha)}$ ($\alpha = 1, 2, \ldots, r$) be unbiased estimators for the functions
$\tau_\alpha(\theta)$. The lower bound for the covariance matrix is given by

$$\mathrm{Cov}_\xi[T_{(\alpha)}, T_{(\beta)}] \geq \mathcal{G}^{ij}\,\partial_i\tau_\alpha\,\partial_j\tau_\beta . \qquad (6.4)$$

This equation is to be interpreted in the sense of saying that the difference
between the left and right hand sides is nonnegative definite.
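
The matrix form of the bound can be checked numerically by testing that the difference of the two sides has no negative eigenvalues. The sketch below is an illustration under assumptions of our own: a two-parameter exponential-type family on a hypothetical sample space of seven points, generated by random statistics `f`, and two unrelated random estimators `t`, each unbiased for its own expectation function. Derivatives of the state are taken by central differences.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 7                                   # size of a hypothetical sample space
f = rng.normal(size=(2, m))             # two statistics defining the model
t = rng.normal(size=(2, m))             # two estimators, as functions of x


def xi(theta):
    """Square-root density of the family p ~ exp(theta^1 f_1 + theta^2 f_2)."""
    w = np.exp(theta @ f)
    return np.sqrt(w / w.sum())


def partial(theta, i, h=1e-5):
    """Central-difference derivative of the state with respect to theta^i."""
    step = np.zeros_like(theta)
    step[i] = h
    return (xi(theta + step) - xi(theta - step)) / (2.0 * h)


theta0 = np.array([0.4, -0.3])
state = xi(theta0)
d_xi = np.array([partial(theta0, i) for i in range(2)])

G = 4.0 * d_xi @ d_xi.T                 # Fisher-Rao metric, as in (2.5)
# d_i tau_alpha = 2 T_(alpha)ab xi^a d_i xi^b, with each T_(alpha) diagonal here.
J = 2.0 * (t * state) @ d_xi.T          # J[alpha, i]
means = t @ state**2
C = (t * state**2) @ t.T - np.outer(means, means)   # covariance matrix

gap = C - J @ np.linalg.inv(G) @ J.T    # left minus right side of (6.4)
eigenvalues = np.linalg.eigvalsh(gap)
```

All entries of `eigenvalues` are nonnegative, which is the sense in which the inequality is to be read.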

7. From Classical to Quantum Theory

In the foregoing material, we have reformulated various aspects of parametric 
statistical inference in terms of the geometry of a real Hilbert space. In particular,
the abstract index notation has enabled us very efficiently to obtain results
relating to statistical curvatures and variance lower bounds. One of the main 
reasons we are interested in formulating statistical estimation theory in a Hilbert 
space framework is on account of the connection with quantum mechanics, which 
becomes more direct when pursued in this manner, thus enabling us in many 
respects to unify our view of classical and quantum statistical estimation. 

The fact that in our approach to classical statistical estimation the geometry
in question is a Hilbert space geometry is a result that physicists may find
surprising. This is because the general view in physics is that the Hilbert space
structure associated with the space of states in nature is special to quantum theory,
and has no analogue in classical probability theory and statistics. We have
seen, however, that a number of structures already present in the classical theory
are highly analogous to associated quantum mechanical structures; but the correspondence
is only readily apparent when the classical theory is reformulated in
the appropriate geometrical framework. A key point is that if we supplement the
real Hilbert space $\mathcal{H}$ with a compatible complex structure, then this paves the way
for a natural attack on problems of quantum statistical inference, and it becomes 
possible to see more clearly which aspects of statistical inference are universal, 
and which are particular to the classical or quantum domain. 

Indeed, there are a number of distinct geometrical formulations of classical
statistical theory, corresponding, for example, to the various $\alpha$-embeddings of
Amari (see, e.g., Amari 1985 or Murray and Rice 1993), but one among these
is singled out on account of its close relation to quantum theory: the geometry
of square-root density functions. This geometry is special because of the way it
singles out the Levi-Civita connection on statistical submanifolds, as indicated in
Proposition 2 above. In this way we are led to consider classical statistics in the
language of real Hilbert space geometry, as indicated in the previous sections. The
real Hilbert space formulation of standard quantum theory, on the other hand,
is in itself a fairly standard construction now, though perhaps not as well known
as it should be, and in the next section we shall develop some of the formalism
necessary for working in this framework.

The specific point of originality in our approach is to make the link between the
natural real Hilbert space arising on the one hand in connection with the classical
theory of statistical inference, and the natural real Hilbert space arising on the
other hand in connection with standard quantum theory. Once this identification
has been made, a number of interesting results can be seen to follow, which
are explored in some detail here. In particular, the theory of classical statistical
estimation can be extended directly to the quantum mechanical situation, and
we are able to show how the Cramér-Rao inequality associated with a pair of
canonically conjugate physical variables can be interpreted as the corresponding
Heisenberg relation in the quantum mechanical context. This ties in neatly with
the important line of investigation in quantum statistical estimation initiated by
Helstrom (1969), Holevo (1973, 1979), and others (e.g., Yuen, Kennedy, and Lax
1975), about which we shall have more to say shortly. One of the most exciting
results emerging as a by-product of our approach is the development of a series of
‘improved’ Heisenberg relations, formulated in some detail in the later sections.


8. Geometry of Quantum States 


Now we turn to quantum geometry. Our goal in this section is to formulate
standard quantum theory in a geometrical language that brings out more clearly
its relation to the statistical geometry which we have developed in sections 2 to
6. We start with our formulation of classical inference, based on a real Hilbert
space geometry, upon which we will now impose additional structure. Thus instead
of ‘completely reformulating everything’ from scratch to develop a quantum
statistical theory, as has conventionally been done, we shall essentially accept the
classical theory, but ‘enrich’ it with some extra structure. The essential additional
ingredient that we must introduce on our real Hilbert space $\mathcal{H}$, in order
to study quantum mechanical systems, is, more specifically, a compatible complex
structure. A complex structure on $\mathcal{H}$ is given by a tensor $J^a{}_b$ satisfying

$$J^a{}_b J^b{}_c = -\delta^a{}_c . \tag{8.1}$$


Given this structure we then say a symmetric operator $X_{ab}$ is Hermitian if it
satisfies the relation

$$J^a{}_c J^b{}_d X_{ab} = X_{cd} . \tag{8.2}$$

An alternative way to express the Hermitian condition is $J^a{}_b X^b{}_c = X^a{}_b J^b{}_c$, which
states that $J^a{}_b$ and $X^a{}_b$ commute. This follows from the complex structure identity
(8.1) and the Hermiticity condition (8.2). We require that the complex structure
be compatible with the Hilbert space structure by insisting that the metric $g_{ab}$ is
Hermitian. As a consequence we have $J^a{}_c J^b{}_d g_{ab} = g_{cd}$, which is to be viewed as a
fundamental relationship holding between $J^a{}_b$ and $g_{ab}$.
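
The conditions (8.1) and (8.2) can be checked numerically in a small case. The following is a minimal sketch, not the authors' construction: it assumes the familiar real representation of $\mathbb{C}^n$ on $\mathbb{R}^{2n}$, $z = x + iy \mapsto (x, y)$, takes the metric to be the identity matrix, and uses hypothetical helper names (`complex_structure`, `realify`) introduced only for this illustration.

```python
# Minimal sketch: real representation of C^n on R^(2n), z = x + iy -> (x, y).
# The (x, y) basis, the identity metric and all helper names are assumptions
# of this example, not notation taken from the text.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def complex_structure(n):
    """J = [[0, -1], [1, 0]] in n x n blocks: multiplication by i on (x, y)."""
    size = 2 * n
    j = [[0] * size for _ in range(size)]
    for k in range(n):
        j[k][n + k] = -1
        j[n + k][k] = 1
    return j

def realify(a):
    """Real form [[S, -K], [K, S]] of a complex matrix a = S + iK."""
    n = len(a)
    top = [[a[i][j].real for j in range(n)] + [-a[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[a[i][j].imag for j in range(n)] + [a[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

n = 2
J = complex_structure(n)
g = [[1 if i == j else 0 for j in range(2 * n)] for i in range(2 * n)]
minus_g = [[-v for v in row] for row in g]

# A Hermitian 2 x 2 matrix (equal to its own conjugate transpose).
A = [[2 + 0j, 1 - 3j], [1 + 3j, -1 + 0j]]
X = realify(A)

squares_to_minus_one = matmul(J, J) == minus_g                  # relation (8.1)
x_is_symmetric = X == transpose(X)
x_is_hermitian = matmul(transpose(J), matmul(X, J)) == X        # relation (8.2)
x_commutes_with_j = matmul(J, X) == matmul(X, J)
metric_is_hermitian = matmul(transpose(J), matmul(g, J)) == g

print(squares_to_minus_one, x_is_symmetric, x_is_hermitian,
      x_commutes_with_j, metric_is_hermitian)
```

In this representation the index expression $J^a{}_c J^b{}_d X_{ab}$ becomes the matrix product $J^{\mathsf T} X J$, which is why the Hermitian condition and the commutation of $J$ with $X$ can both be tested with ordinary matrix arithmetic.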
In order to proceed further it will be useful to make a comparison of the
index notation being used here with the conventional Dirac notation. In the ‘real’
approach to quantum theory, state vectors are represented by elements of a real
Hilbert space $\mathcal{H}$. We find that if $\xi^a$ and $\eta^a$ are two real Hilbert space vectors,
then their Dirac product is given by the following complex expression:

$$\langle\eta|\xi\rangle = \tfrac{1}{2}\eta^a\left(g_{ab} - i g_{ac}J^c{}_b\right)\xi^b . \tag{8.3}$$

The Hermitian property of $g_{ab}$ implies that the tensor $\Omega_{ab} = g_{ac}J^c{}_b$ is automatically
antisymmetric and invertible, i.e., a symplectic structure, which also satisfies
the Hermitian condition in the sense that $J^a{}_c J^b{}_d\Omega_{ab} = \Omega_{cd}$. Since the symplectic
structure $\Omega_{ab}$ is antisymmetric, it follows then that the Dirac norm agrees with
the real Hilbertian norm (apart from the factor of two):

$$\langle\xi|\xi\rangle = \tfrac{1}{2}g_{ab}\xi^a\xi^b . \tag{8.4}$$
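
As a numerical illustration of (8.3) and (8.4), the sketch below compares the conventional Dirac product with its real form. It rests on assumptions that are not part of the text: complex vectors are mapped to $\sqrt{2}\,(\mathrm{Re}\,z, \mathrm{Im}\,z)$ so that the factor of two in (8.4) appears, the metric is the identity, and $J$ acts as $(x, y) \mapsto (-y, x)$. The function names are hypothetical.

```python
import math

def dirac_product(eta, xi):
    """Conventional <eta|xi> = sum of conj(eta_k) * xi_k."""
    return sum(e.conjugate() * x for e, x in zip(eta, xi))

def to_real(z):
    """Map z to sqrt(2) * (Re z, Im z), so that g_ab xi^a xi^b = 2 <xi|xi>."""
    root2 = math.sqrt(2.0)
    return [root2 * c.real for c in z] + [root2 * c.imag for c in z]

def apply_j(v):
    """J(x, y) = (-y, x): multiplication by i in the (Re, Im) basis."""
    n = len(v) // 2
    return [-c for c in v[n:]] + list(v[:n])

def real_dirac_product(eta_r, xi_r):
    """(1/2) eta^a (g_ab - i Omega_ab) xi^b, with g = identity, Omega_ab = g_ac J^c_b."""
    g_term = sum(e * x for e, x in zip(eta_r, xi_r))
    omega_term = sum(e * x for e, x in zip(eta_r, apply_j(xi_r)))
    return 0.5 * complex(g_term, -omega_term)

eta = [1 + 2j, 0.5 - 1j]
xi = [3 - 1j, 0 + 2j]

lhs = dirac_product(eta, xi)
rhs = real_dirac_product(to_real(eta), to_real(xi))
# Gap between the Dirac norm and half the real Hilbertian norm, as in (8.4).
norm_gap = abs(real_dirac_product(to_real(xi), to_real(xi)) - dirac_product(xi, xi))

print(abs(lhs - rhs) < 1e-12, norm_gap < 1e-12)
```

The imaginary part of the product is carried entirely by the symplectic term, which vanishes when the two vectors coincide; this is the antisymmetry argument used to obtain (8.4).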

A real Hilbert space vector $\xi^a$ can be decomposed into complex ‘positive’ and
‘negative’ parts, relative to the specified complex structure, according to the
scheme $\xi^a = \xi^a_+ + \xi^a_-$, where

$$\xi^a_\pm = \tfrac{1}{2}\left(\xi^a \mp iJ^a{}_b\xi^b\right) . \tag{8.5}$$

In the case of relativistic fields, this decomposition corresponds to splitting the
fields into positive and negative frequency parts, so occasionally we refer to $\xi^a_+$
and $\xi^a_-$ as the ‘positive frequency’ and ‘negative frequency’ parts of $\xi^a$. Note
that $\xi^a_+$ and $\xi^a_-$ are complex ‘eigenstates’ of the $J^a{}_b$ operator, in the sense that
$J^a{}_b\xi^b_\pm = \pm i\xi^a_\pm$. As a consequence, the Hermitian condition (8.2) implies that
two vectors of the same ‘type’ (e.g., a pair of positive vectors) are necessarily
orthogonal with respect to the metric $g_{ab}$. In other words, we have $g_{ab}\xi^a_\pm\eta^b_\pm = 0$
for any two positive (negative) vectors $\xi^a_\pm$ and $\eta^a_\pm$.

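For certain purposes it is useful to introduce Greek indices to denote positive
and negative parts, by writing $\xi^a = (\xi^\alpha, \bar\xi_\alpha)$, where $\bar\xi_\alpha$ is the complex conjugate
of $\xi^\alpha$. Then, we can identify $\xi^\alpha$ with the Dirac ‘ket’ vector $|\xi\rangle$, and $\bar\xi_\alpha$ with
the complex conjugate Dirac ‘bra’ vector $\langle\xi|$, and write $\xi^a = (|\xi\rangle, \langle\xi|)$. To be
more specific, a typical element in the complex Hilbert space is denoted $\psi^\alpha$, or
equivalently $|\psi\rangle$ in the conventional Dirac notation, and an element in the dual
space is denoted $\varphi_\alpha = \langle\varphi|$. Hence, their inner product is written $\varphi_\alpha\psi^\alpha = \langle\varphi|\psi\rangle$.
The complex conjugate of the vector $\psi^\alpha$ is $\bar\psi_\alpha = \langle\psi|$, and its norm is then given by
$\bar\psi_\alpha\psi^\alpha = \langle\psi|\psi\rangle$. If we denote the splitting of a real Hilbert space $\mathcal{H}$ into positive
and negative eigenspaces by $\mathcal{H} = \mathcal{H}^+ \oplus \mathcal{H}^-$, then an ‘operator’ in quantum
mechanics can be regarded as a linear map $T^\alpha{}_\beta$ from a domain in $\mathcal{H}^+$ to $\mathcal{H}^+$, given,
e.g., by $T^\alpha{}_\beta\xi^\beta = \eta^\alpha$, for which the corresponding complex conjugate operator is
$\bar T_\alpha{}^\beta$, acting as $\bar T_\alpha{}^\beta\bar\xi_\beta = \bar\eta_\alpha$. Thus, if $T$ is Hermitian, we have
$\bar T_\alpha{}^\beta = T^\beta{}_\alpha$, and it follows that the
expectation $\langle T\rangle$ of $T$ in the state $\psi$ is given by

$$\langle T\rangle = \frac{\langle\psi|T|\psi\rangle}{\langle\psi|\psi\rangle} , \tag{8.6}$$

and the variance of $T$ is

$$\mathrm{Var}_\psi[T] = \frac{\langle\psi|\tilde T^2|\psi\rangle}{\langle\psi|\psi\rangle} , \tag{8.7}$$

where $\tilde T^\alpha{}_\beta = T^\alpha{}_\beta - \langle T\rangle\delta^\alpha{}_\beta$. Note that $\langle\tilde T\rangle = 0$.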

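Now let us say more about the Hermitian condition. Having the decomposition
$\xi^a = (\xi^\alpha, \bar\xi_\alpha)$ in mind, we can represent a given second rank tensor $T_{ab}$ (not
necessarily symmetric, Hermitian, or real) in matrix form by writing

$$T_{ab} = \begin{pmatrix} A_{\alpha\beta} & B_\alpha{}^\beta \\ C^\alpha{}_\beta & D^{\alpha\beta} \end{pmatrix} . \tag{8.8}$$

Similarly, the complex structure $J^a{}_b$ can be written

$$J^a{}_b = \begin{pmatrix} i\delta^\alpha{}_\beta & 0 \\ 0 & -i\delta_\alpha{}^\beta \end{pmatrix} . \tag{8.9}$$

Thus for the action of the complex structure tensor we find $J^a{}_b\xi^b = (i\xi^\alpha, -i\bar\xi_\alpha)$.
In other words, the effect of $J^a{}_b$ is to multiply the ‘ket’ part of the
given state by $i$, and the ‘bra’ part by $-i$. Moreover, it can be verified that

$$J^c{}_a J^d{}_b T_{cd} = \begin{pmatrix} -A_{\alpha\beta} & B_\alpha{}^\beta \\ C^\alpha{}_\beta & -D^{\alpha\beta} \end{pmatrix} . \tag{8.10}$$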


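Therefore, the requirement that the tensor $T_{ab}$ should be symmetric implies
$A_{\alpha\beta} = A_{(\alpha\beta)}$, $D^{\alpha\beta} = D^{(\alpha\beta)}$, and $B_\alpha{}^\beta = C^\beta{}_\alpha$. The Hermitian condition then
implies $A_{\alpha\beta} = 0$ and $D^{\alpha\beta} = 0$, and the reality condition implies $\bar B^\alpha{}_\beta = C^\alpha{}_\beta$.
A symmetric, real Hermitian tensor $T_{ab}$ can be represented in matrix form by
writing

$$T_{ab} = \begin{pmatrix} 0 & T^\beta{}_\alpha \\ T^\alpha{}_\beta & 0 \end{pmatrix} , \tag{8.11}$$

where $T^\alpha{}_\beta = \bar T_\beta{}^\alpha$. It follows that the quadratic form $T_{ab}\xi^a\eta^b$ for a Hermitian
tensor $T_{ab}$ is given by $T_{ab}\xi^a\eta^b = \bar\xi_\alpha T^\alpha{}_\beta\eta^\beta + T^\beta{}_\alpha\xi^\alpha\bar\eta_\beta$. In the special case of the
metric $g_{ab}$ we have

$$g_{ab} = \begin{pmatrix} 0 & \delta^\beta{}_\alpha \\ \delta^\alpha{}_\beta & 0 \end{pmatrix} , \tag{8.12}$$

from which it follows that $g_{ab}\xi^a\eta^b = \xi^\alpha\bar\eta_\alpha + \bar\xi_\alpha\eta^\alpha$. Also, for the symplectic
structure $\Omega_{ab}$ we obtain

$$\Omega_{ab} = \begin{pmatrix} 0 & -i\delta^\beta{}_\alpha \\ i\delta^\alpha{}_\beta & 0 \end{pmatrix} , \tag{8.13}$$

so that $\Omega_{ab}\xi^a\eta^b = i\bar\xi_\alpha\eta^\alpha - i\xi^\alpha\bar\eta_\alpha$. Clearly, we then have

$$\xi^\alpha\bar\eta_\alpha = \tfrac{1}{2}\xi^a\left(g_{ab} + i\Omega_{ab}\right)\eta^b , \tag{8.14}$$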


which is consistent with equation (8.3), if we bear in mind that $\Omega_{ab}$ is antisymmetric.
With these relations at hand, the reformulation in ‘real’ terms of the standard
‘complex’ formalism of quantum theory can be pursued in a straightforward,
systematic way. It is worth bearing in mind that in formulating quantum theory
in real terms in this way we have not altered the results or physical content of
the theory. These remain unchanged. Our purpose, rather, is to highlight in this
way certain geometrical and probabilistic aspects of ordinary quantum mechanics
that otherwise may not be obvious. For further details of the ‘real’ approach
to complex Hilbert space geometry and its significance in quantum mechanics,
see for example Ashtekar and Schilling (1995), Field (1996), Geroch (1971a,b),
Gibbons (1992), Gibbons and Pohle (1993), Kibble (1978, 1979), Schilling (1996),
Segal (1947), and Wald (1976, 1994).


9. Real Hilbert Space Dynamics 

In this section we take the discussion further by consideration of the quantum
mechanical commutation relations, as seen from a ‘real’ Hilbert space point
of view. This then leads us to a natural ‘real’ formulation of the Schrödinger
equation.

If $X_{ab}$ and $Y_{ab}$ are a pair of symmetric operators, then their ‘skew product’
defined by the expression $X_a{}^cY_{cb} - X_b{}^cY_{ca}$ is an antisymmetric tensor, and thus
itself cannot represent a random variable. Nevertheless, in the case of Hermitian
operators, there is a natural isomorphism between the space of symmetric tensors
$X_{ab}$ satisfying $J^c{}_aJ^d{}_bX_{cd} = X_{ab}$ and antisymmetric tensors $K_{ab}$ satisfying
$J^c{}_aJ^d{}_bK_{cd} = K_{ab}$, and the map in question is given by contraction with $J^a{}_b$. This
follows from the fact that if $X_{ab}$ is symmetric and Hermitian, then $K_{ab} = X_{ac}J^c{}_b$
is automatically antisymmetric and Hermitian. Conversely, if $K_{ab}$ is antisymmetric
and Hermitian, then $K_{ac}J^c{}_b$ is automatically symmetric and Hermitian. Thus
to form the commutator of two symmetric Hermitian operators we first take their
skew product, which we then multiply by the complex structure tensor to give
us a symmetric Hermitian operator. After some rearrangement of terms, these
results can be summarised as follows:

Lemma 3. The commutator $Z = i[X, Y]$ of a pair of symmetric, Hermitian
operators $X$ and $Y$ is given by the symmetric Hermitian operator
$Z_{ab} = (X_{ac}Y_{bd} - Y_{ac}X_{bd})\Omega^{cd}$.

Note that the symplectic structure $\Omega_{ab}$ (or equivalently, the complex structure)
is playing the role of ‘i’ in the relation $Z = i[X, Y]$ so as to give us a real,
symmetric, Hermitian tensor $Z_{ab}$.

The anticommutator $W = \{X, Y\}$ between two symmetric operators $X_{ab}$ and
$Y_{ab}$ is defined by $W_{ab} = X_a{}^cY_{cb} + Y_a{}^cX_{cb}$. This is a more ‘primitive’ operation on the
space of symmetric tensors since it does not require introduction of a complex
structure. The basic operator identity

$$\{\{A, B\}, C\} - \{A, \{B, C\}\} = [B, [A, C]] \tag{9.1}$$

shows that even in the absence of a Hermitian structure the incompatibility between
a pair of random variables can be expressed in terms of the nonassociativity
of the symmetric product. In other words, we say two random variables $A$ and $C$
are compatible iff the left-hand side of (9.1) vanishes for any choice of $B$.
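
The identity (9.1) is purely algebraic and can be confirmed exactly with small integer matrices. The sketch below is only an illustration under stated assumptions: the matrices `A`, `B`, `C` are arbitrary symmetric examples chosen here, and `associator` is a hypothetical name for the left-hand side of (9.1), with the anticommutator taken as $\{A, B\} = AB + BA$.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def associator(a, b, c):
    """{{a, b}, c} - {a, {b, c}}, the left-hand side of (9.1)."""
    return sub(anticommutator(anticommutator(a, b), c),
               anticommutator(a, anticommutator(b, c)))

# Three symmetric integer matrices (illustrative choices only).
A = [[1, 2], [2, 0]]
B = [[0, 1], [1, 3]]
C = [[2, -1], [-1, 1]]
zero = [[0, 0], [0, 0]]

identity_holds = associator(A, B, C) == commutator(B, commutator(A, C))
incompatible = associator(A, B, C) != zero      # A and C do not commute

# C2 = A*A commutes with A, so A and C2 should be compatible:
# the associator vanishes whichever middle element is inserted.
C2 = matmul(A, A)
compatible = all(associator(A, middle, C2) == zero for middle in (B, C))

print(identity_holds, incompatible, compatible)
```

Because all entries are integers the comparison is exact, so the check does not depend on any floating-point tolerance.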

Now, suppose the Hamiltonian of a quantum mechanical system is represented
by a symmetric Hermitian operator $H_{ab}$. In fact, we need $H_{ab}$ to be self-adjoint
(a stronger condition), but this need not concern us for the moment. Then for
the Schrödinger equation we have:

$$\dot\xi^a = J^a{}_b H^b{}_c\,\xi^c . \tag{9.2}$$

Note that again the role of the usual ‘i’ factor is played by the complex structure
tensor. Expressing this relation in terms of positive and negative parts, we then
recover the conventional form of the Schrödinger equation $i\partial_t\xi^\alpha = H^\alpha{}_\beta\xi^\beta$, together
with its complex conjugate. In Dirac’s notation this is of course $i\partial_t|\xi\rangle = H|\xi\rangle$.

As a consequence of (9.2) it follows at once that $g_{ab}\xi^a\dot\xi^b = 0$. This is due to the
Hermitian relation which says that $J_{ac}H^c{}_b$ is antisymmetric. Thus, as expected,
the Schrödinger equation respects the normalisation $g_{ab}\xi^a\xi^b = 1$.

Having formulated the conventional quantum dynamics in terms of real Hilbert
space, we are in a position to make an interesting link with statistical considerations.
To begin, we note that the usual phase freedom in quantum mechanics
can be incorporated by modifying the Hamiltonian according to the prescription
$H_{ab} \to H_{ab} + \varphi g_{ab}$. We can take advantage of this freedom by consideration
of the following result.


Proposition 10. There is a unique choice of phase such that the tangent
vector $\dot\xi^a$ of the dynamical trajectory is everywhere orthogonal to the direction
$J^a{}_b\xi^b$. This choice of $\varphi$ minimises the Fisher information $4g_{ab}\dot\xi^a\dot\xi^b$.

In fact, the relevant phase factor is given by $\varphi = -H_{ab}\xi^a\xi^b / g_{cd}\xi^c\xi^d$. Physically,
this choice of phase fixing implies an adjustment of the mean of the Hamiltonian.
Clearly, we have $g_{ab}\dot\xi^aJ^b{}_c\xi^c = 0$, and it is not difficult to see that the same choice of
$\varphi$ minimises $4g_{ab}\dot\xi^a\dot\xi^b$. In fact, for general $\varphi$ we have
$g_{ab}\dot\xi^a\dot\xi^b = H_{ac}H^c{}_b\xi^a\xi^b + 2\varphi H_{ab}\xi^a\xi^b + \varphi^2 g_{ab}\xi^a\xi^b$,
from which it follows at once that $g_{ab}\dot\xi^a\dot\xi^b$ is minimised
for the choice of $\varphi$ indicated. This result will be used extensively in our work on
quantum estimation. With this choice of phase the modified Schrödinger equation
reads

$$\dot\xi^a = J^a{}_b\tilde H^b{}_c\,\xi^c , \tag{9.3}$$

where

$$\tilde H_{ab} = H_{ab} - \left(\frac{H_{cd}\xi^c\xi^d}{g_{ef}\xi^e\xi^f}\right) g_{ab} \tag{9.4}$$

represents now the deviation of the Hamiltonian from its mean, in accordance
with the notation introduced earlier. Note that for the state $\eta^a$ defined by $\eta^a = J^a{}_b\xi^b$
the dynamical equation becomes $\dot\eta^a = J^a{}_b\tilde H^b{}_c\eta^c$, since $J^a{}_b$ commutes with $\tilde H^a{}_b$.
Thus, $\eta^a$ also satisfies the Schrödinger equation. We can think of the complex
projective space (in general infinite dimensional) formed by projectivising the
‘positive’ Hilbert space $\mathcal{H}^+$ as being the ‘true’ space of pure states. Then the
essence of Proposition 10 is that there is a unique ‘lift’ from this projective space
$P(\mathcal{H}^+)$ to the real Hilbert space $\mathcal{H}$ such that the tangent vector $\dot\xi^a$ is everywhere
orthogonal to both $\xi^a$ and $J^a{}_b\xi^b$.
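
The phase-fixing argument of Proposition 10 can be traced through in a four-dimensional example. This is a minimal sketch under assumptions of our own: the real Hilbert space is $\mathbb{R}^4$ with identity metric, `J` is the block matrix acting as $(x, y) \mapsto (-y, x)$, and `H` is the real form of an illustrative $2 \times 2$ Hermitian matrix; none of these particular values comes from the text.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Complex structure on R^4 in the (Re, Im) basis (an assumed representation).
J = [[0, 0, -1, 0],
     [0, 0, 0, -1],
     [1, 0, 0, 0],
     [0, 1, 0, 0]]

# Real form [[S, -K], [K, S]] of the Hermitian matrix
# [[1, 0.5 - 0.5i], [0.5 + 0.5i, 2]]; an illustrative Hamiltonian.
H = [[1.0, 0.5, 0.0, 0.5],
     [0.5, 2.0, -0.5, 0.0],
     [0.0, -0.5, 1.0, 0.5],
     [0.5, 0.0, 0.5, 2.0]]

xi = [0.5, 0.5, 0.5, 0.5]          # normalised: g_ab xi^a xi^b = 1

def velocity(phi):
    """Tangent vector J (H + phi g) xi for the phase-shifted Hamiltonian."""
    shifted = [h + phi * x for h, x in zip(matvec(H, xi), xi)]
    return matvec(J, shifted)

def fisher_information(phi):
    """4 g_ab xi_dot^a xi_dot^b as a function of the phase phi."""
    v = velocity(phi)
    return 4.0 * dot(v, v)

# Phase singled out in the text: minus the mean of the Hamiltonian.
phi_star = -dot(xi, matvec(H, xi)) / dot(xi, xi)

overlap_with_j_xi = dot(velocity(phi_star), matvec(J, xi))
overlap_with_xi = dot(velocity(phi_star), xi)
is_minimum = all(fisher_information(phi_star) < fisher_information(phi_star + d)
                 for d in (-0.1, 0.1))

print(phi_star, overlap_with_j_xi, overlap_with_xi, is_minimum)
```

Since $g_{ab}\dot\xi^a\dot\xi^b$ is a quadratic in $\varphi$ with leading coefficient $g_{ab}\xi^a\xi^b$, shifting the phase by $\delta$ away from the optimum raises the Fisher information by exactly $4\delta^2$ for a normalised state, which the comparison at $\pm 0.1$ reflects.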

As a matter of interpretation we make the following observation regarding
the ‘modified’ Schrödinger equation. In the standard treatment of quantum mechanics
one is taught that the time independent Schrödinger equation is given

Phil. Trans. R. Soc. Lond. A (1996)






D.C. Brody, L.P. Hughston 


by $H|\xi\rangle = E|\xi\rangle$, whereas the time dependent case can be written by use of
the correspondence principle $E \leftrightarrow i\partial_t$. Although generally accepted, the basis
of this correspondence has to be regarded as somewhat mysterious, and to that
extent also unsatisfactory. Now, in the modified Schrödinger equation we have
$i\partial_t|\xi\rangle = (H - \langle H\rangle)|\xi\rangle$. Hence if the state is time independent, we recover the
usual time independent equation $(H - E)|\xi\rangle = 0$, with $E = \langle H\rangle$. In this way, we do not have
to specify which representation of the canonical commutation relations we work
with. While in general terms the theory is independent of the specific choice of
phase, it seems that there is a unique choice of phase that makes everything fit
in well from a physical point of view, and interestingly we are led to the same
result from purely statistical considerations.

Now suppose $B_{ab}$ is a symmetric Hermitian operator and we write $B(t) := E_{\xi(t)}[B]$
for the expectation

$$B(t) = \frac{B_{ab}\xi^a(t)\xi^b(t)}{g_{cd}\xi^c(t)\xi^d(t)} , \tag{9.5}$$

where $\xi^a(t)$ satisfies the modified Schrödinger equation (9.3), or equivalently,
$\xi^a(t) = \exp[tJ^a{}_c\tilde H^c{}_b]\,\xi^b(0)$, where $\xi^a(0)$ is the state vector at $t = 0$. Thus $B(t)$
represents the expectation of $B_{ab}$ along the given trajectory. It follows that

$$\frac{dB(t)}{dt} = \frac{C_{ab}\xi^a(t)\xi^b(t)}{g_{cd}\xi^c(t)\xi^d(t)} \tag{9.6}$$

along $\xi^a(t)$, where $C_{ab} = (B_{ac}H_{bd} - H_{ac}B_{bd})\Omega^{cd}$ is the commutator between $B_{ab}$
and $H_{ab}$. This is the ‘real’ version of the familiar relation $d\langle B\rangle/dt = i\langle[B, H]\rangle$.
It is important to note that use of the ‘modified’ Schrödinger equation does not
affect this result.

If $P_{ab}$ and $Q_{ab}$ are symmetric Hermitian operators satisfying the commutation
relation

$$(P_{ac}Q_{bd} - Q_{ac}P_{bd})\,\Omega^{cd} = g_{ab} , \tag{9.7}$$

then we say that $P_{ab}$ is canonically conjugate to $Q_{ab}$, and we refer to (9.7) as
the Heisenberg canonical commutation relation. This would apply, for example,
when $P_{ab}$ and $Q_{ab}$ are the self-adjoint position and momentum operators of a
quantum system. In fact, the Heisenberg commutation relation (9.7) has to be
regarded to some degree as formal, since the domain in $\mathcal{H}$ over which (9.7) is
valid is not necessarily obvious. This point can be remedied by consideration of
the Weyl relation, which offers a more general and, ultimately, more rigorous
basis for formulating the concept of canonical conjugacy. In real terms the Weyl
relation is given by

$$\exp[-qJ^a{}_bP^b{}_c]\,\exp[pJ^c{}_dQ^d{}_e] = \exp[pJ^a{}_b(Q^b{}_c + q\delta^b{}_c)]\,\exp[-qJ^c{}_dP^d{}_e] , \tag{9.8}$$

where $p$ and $q$ are parameters. Note that in the Weyl relation the effect of interchanging
the two terms on the left is to ‘shift’ the operator $Q^a{}_b$ by the amount
$q\delta^a{}_b$. The Heisenberg commutation relation (9.7) is then obtained by formally
differentiating (9.8) with respect to $p$ and $q$, then setting them to zero.


10. Quantum Measurement

We shall now turn to the problem of parametric estimation for quantum mechanical
states. By expressing quantum theory within a real Hilbert space framework,
and studying the corresponding ‘real dynamics’, we take advantage of the
geometrical formulation of statistical inference outlined earlier. We begin with
some remarks indicating the general setting for our investigations in quantum
measurement and quantum statistical estimation. In particular, with a view to
the parameter estimation problem we shall be considering shortly, it will suffice for
our purposes to examine the case where we are concerned with the measurement
of an observable with a continuous spectrum, such as position or momentum.

We shall consider the situation where the system is in a pure state, characterised
in real terms by a state vector $\xi^a$ in a real Hilbert space $\mathcal{H}$ equipped with an inner
product $g_{ab}$ and a compatible complex structure $J^a{}_b$. It is possible also to consider
the case where the state is described by a general density matrix, but this is not
required for present purposes.

The measurement of an observable is fully characterised in quantum mechanics
by the specification of a resolution of the identity. By this we mean a
one-parameter family $M_{ab}(x)$ of positive symmetric Hermitian operators that
integrate up to form the identity operator. Thus we have $M_{ab} = M_{ba}$, $M_{ab}\xi^a\xi^b \geq 0$
for any vector $\xi^a \in \mathcal{H}$, $M_{cd}J^c{}_aJ^d{}_b = M_{ab}$, and

$$\int_{-\infty}^{\infty} M_{ab}(x)\,dx = g_{ab} . \tag{10.1}$$

Then the probability that the observable $X$ represented by the measurement
$M_{ab}(x)$ lies in the interval $\alpha < x < \beta$, if the state of the system is $\xi^a$, is given by

$$\mathrm{Prob}[\alpha < X < \beta] = \int_\alpha^\beta M_{ab}(x)\xi^a\xi^b\,dx , \tag{10.2}$$

and for the expectation of $X$ we have

$$E_\xi[X] = \int_{-\infty}^{\infty} x\,M_{ab}(x)\xi^a\xi^b\,dx . \tag{10.3}$$

The observable $X_{ab}$ itself, on the other hand, is given by

$$X_{ab} = \int_{-\infty}^{\infty} x\,M_{ab}(x)\,dx , \tag{10.4}$$

from which it follows that $E_\xi[X] = X_{ab}\xi^a\xi^b$. The probability law (10.2) is not
readily ascertainable from the operator $X_{ab}$ directly, and this is why one needs
the resolution of the identity $M_{ab}(x)$, or equivalently, the density function

$$p(x|\xi) = M_{ab}(x)\xi^a\xi^b \tag{10.5}$$

for the random variable $X$, conditional on the specification of the state $\xi^a$.

It is interesting to note the relationship between properties of the operator
$X_{ab}$ defined by the spectral resolution (10.4) and the corresponding resolution
of the identity $M_{ab}(x)$. If $X_{ab}$ is a bounded operator, which is to say that there
exists a constant $c$ such that $|X_{ab}\xi^a\xi^b| < c\,g_{ab}\xi^a\xi^b$ for all $\xi^a \in \mathcal{H}$, then there
exists a unique spectral resolution (10.4) with the following two additional properties:
i) the resolution is orthogonal, or projection valued, in the sense that
$g^{ab}M_{ac}(x)M_{bd}(y) = M_{cd}(x)\delta(x - y)$; and ii) the family of operators $M_{ab}(x)$ has
compact support in the variable $x$.

More generally, if $X_{ab}$ is self-adjoint (but not necessarily bounded) then there
exists a unique orthogonal resolution of unity $M_{ab}(x)$ such that $X_{ab}$ is given by
(10.4), and the domain of the operator $X_{ab}$ consists of all vectors $\xi^a$ satisfying

$$\int_{-\infty}^{\infty} x^2 M_{ab}(x)\xi^a\xi^b\,dx < \infty . \tag{10.6}$$

On the other hand, if $X_{ab}$ is maximally symmetric, then there exists a unique
resolution $M_{ab}(x)$ such that $X_{ab}$ has the spectral representation (10.4), and the
expectation of its square is given by

$$X_a{}^cX_{cb}\,\xi^a\xi^b = \int_{-\infty}^{\infty} x^2 M_{ab}(x)\xi^a\xi^b\,dx \tag{10.7}$$

for any state $\xi^a$ in the domain of $X_{ab}$, which is given by (10.6). In this case the
resolution of the identity is not orthogonal.

In noting these results we recall that the domain $\mathcal{D}(X)$ of a densely defined operator
$X_{ab}$ consists of those state vectors $\xi^a$ for which $X_{ab}\xi^b$ exists, or equivalently,
for which $X_a{}^cX_{cb}\xi^a\xi^b < \infty$. If $X_{ab}\xi^a\eta^b = X_{ab}\eta^a\xi^b$ for all $\xi^a, \eta^a \in \mathcal{D}(X)$, then we
say $X_{ab}$ is symmetric, and write $X_{ab} = X_{ba}$. Then we define the adjoint domain
$\mathcal{D}^*(X)$ to consist of all those vectors $\eta^a$ for which there exists a vector $\zeta^a$ such
that $\eta^a(X_{ab}\xi^b) = \zeta_b\xi^b$ for every $\xi^a$ in $\mathcal{D}(X)$. For any element $\eta^a \in \mathcal{D}^*(X)$ we thus
define the adjoint operator $X^*_{ab}$, with domain $\mathcal{D}^*(X)$, by the action $X^*_{ab}\eta^b = \zeta_a$. If
$\mathcal{D}(X) = \mathcal{H}$, then $X_{ab}$ is bounded; if $\mathcal{D}(X) = \mathcal{D}^*(X)$ then we say $X_{ab}$ is self-adjoint.
For a symmetric operator, $\mathcal{D}(X) \subset \mathcal{D}^*(X)$. If $\mathcal{D}(X) \subset \mathcal{D}(Y)$ and if $X_{ab} = Y_{ab}$
on $\mathcal{D}(X)$, then we say $Y_{ab}$ is an extension of $X_{ab}$. An operator is said to be maximally
symmetric if it is symmetric, but has no self-adjoint extension. See, e.g.,
Bogolubov et al. (1975) or Reed and Simon (1974) for further details.

Perhaps it can be stated that one of the most important modern developments
in the understanding of basic quantum theory was the realisation that general
measurements are given by positive operator-valued measures (POM), which involve
general, nonorthogonal resolutions of the identity in an essential way. At
the same time, one has to understand the category of observables in quantum mechanics
to be widened on that basis to include maximally symmetric operators,
as opposed to merely self-adjoint operators. See, e.g., Davies and Lewis (1970),
Helstrom (1969), and Holevo (1973, 1979).
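
A finite, discrete analogue may help fix ideas about nonorthogonal resolutions of the identity. The sketch below is not taken from the text, which concerns continuous spectra: it assumes a single qubit written in real form on $\mathbb{R}^4$ with identity metric, and uses the well-known three-element ‘trine’ measurement, with hypothetical helper names, as a stand-in for $M_{ab}(x)$.

```python
import math

def trine_element(k):
    """Real 4 x 4 form diag(m, m) of the qubit operator (2/3)|psi_k><psi_k|,
    where psi_k = (cos(2 pi k / 3), sin(2 pi k / 3)) has real amplitudes."""
    angle = 2.0 * math.pi * k / 3.0
    c, s = math.cos(angle), math.sin(angle)
    w = 2.0 / 3.0
    m = [[w * c * c, w * c * s], [w * c * s, w * s * s]]
    return [[m[0][0], m[0][1], 0.0, 0.0],
            [m[1][0], m[1][1], 0.0, 0.0],
            [0.0, 0.0, m[0][0], m[0][1]],
            [0.0, 0.0, m[1][0], m[1][1]]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quadratic_form(m, v):
    """M_ab v^a v^b."""
    return sum(v[i] * m[i][j] * v[j] for i in range(len(v)) for j in range(len(v)))

elements = [trine_element(k) for k in range(3)]

# Discrete counterpart of (10.1): the elements sum to the metric (identity).
total = [[sum(m[i][j] for m in elements) for j in range(4)] for i in range(4)]

# Real form (Re, Im) of the qubit state (0.6, 0.8i), normalised in the metric.
xi = [0.6, 0.0, 0.0, 0.8]

# Discrete counterpart of (10.5): outcome probabilities M_ab xi^a xi^b.
probabilities = [quadratic_form(m, xi) for m in elements]

# Nonorthogonality: the product of two distinct elements does not vanish.
product_01 = matmul(elements[0], elements[1])

print(probabilities, sum(probabilities), product_01[0][0])
```

The elements are positive and sum to the identity, so they define a legitimate probability law, yet they are not mutually orthogonal projections; this is the discrete shadow of the situation described for maximally symmetric operators.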

We note this because some of the most interesting parameter estimation problems
in quantum statistical inference involve nonorthogonal resolutions (for example,
time and phase measurement), for which the relevant estimators are maximally
symmetric operators characterised by nonorthogonal resolutions.

In what follows we shall be particularly concerned with measurements associated
with one-parameter families of unitary transformations. In this connection
we point out that a transformation $\xi^a \to O^a{}_b\xi^b$ represents a general rotation of
the real Hilbert space $\mathcal{H}$ about its origin if $O^a{}_cO^b{}_dg_{ab} = g_{cd}$. A unitary transformation
$\xi^a \to U^a{}_b\xi^b$ is characterised by an operator $U^a{}_b$ that is both orthogonal,
in the sense that the metric is preserved, and symplectic, in the sense that the
symplectic structure $\Omega_{ab}$ is also preserved, so $U^a{}_cU^b{}_d\Omega_{ab} = \Omega_{cd}$. This gives us a
characterisation of unitary transformations in purely real terms.
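
The distinction between an orthogonal transformation and a unitary one can be seen in a small example. The following sketch assumes the $(\mathrm{Re}, \mathrm{Im})$ representation of $\mathbb{C}^2$ on $\mathbb{R}^4$ with identity metric, so that $\Omega_{ab}$ coincides with the matrix of $J$; `real_unitary` and `preserves` are names invented for this illustration.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r, s in zip(a, b) for x, y in zip(r, s))

def preserves(u, form):
    """True if U^a_c U^b_d form_ab = form_cd, i.e. U^T form U = form."""
    return close(matmul(transpose(u), matmul(form, u)), form)

g = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
omega = [[0.0, 0.0, -1.0, 0.0],
         [0.0, 0.0, 0.0, -1.0],
         [1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]      # Omega_ab = g_ac J^c_b with g = identity

def real_unitary(alpha, beta):
    """Real 4 x 4 form of the complex unitary diag(e^{i alpha}, e^{i beta})."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    return [[ca, 0.0, -sa, 0.0],
            [0.0, cb, 0.0, -sb],
            [sa, 0.0, ca, 0.0],
            [0.0, sb, 0.0, cb]]

U = real_unitary(0.3, 1.1)

# Complex conjugation (x, y) -> (x, -y): a rotation of R^4 that is not unitary.
conjugation = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, -1.0, 0.0],
               [0.0, 0.0, 0.0, -1.0]]

print(preserves(U, g), preserves(U, omega))
print(preserves(conjugation, g), preserves(conjugation, omega))
```

Complex conjugation preserves the metric but reverses the sign of the symplectic structure, so it is orthogonal without being unitary, whereas the phase rotation preserves both.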
Now suppose $U^a{}_b(\theta)$ is a continuous one-parameter family of unitary transformations
on $\mathcal{H}$, satisfying $U^a{}_b(\theta)U^b{}_c(\theta') = U^a{}_c(\theta + \theta')$. Then there exists a
self-adjoint operator $F_{ab}$ such that $U^a{}_b(\theta) = \exp[\theta J^a{}_cF^c{}_b]$. For example, if $P_{ab}$
is the momentum operator in a specific direction, then $U^a{}_b(\theta) = \exp[\theta J^a{}_cP^c{}_b]$
can be interpreted as a ‘shift’ operator, and the one-parameter family of states
$\xi^a(\theta) = U^a{}_b(\theta)\xi^b$ is obtained by shifting the states along the given axis. The question
arises then, given such a family $\xi^a(\theta)$, what measurements can we perform
to determine the relation of the true state of the system to the original state $\xi^a$?
A resolution of the identity $M_{ab}(x)$ is said to be covariant with respect to the
one-parameter family of unitary transformations $U^a{}_b(\theta)$ if

$$U^c{}_a(\theta)U^d{}_b(\theta)M_{cd}(x) = M_{ab}(x - \theta) . \tag{10.8}$$

In that case it is straightforward to verify that the symmetric operator $\Theta_{ab}$ defined
by

$$\Theta_{ab} = \int_{-\infty}^{\infty} (x - \mu)\,M_{ab}(x)\,dx , \tag{10.9}$$

where $\mu = \int x\,M_{ab}(x)\xi^a\xi^b\,dx$ is the mean in the original state $\xi^a$, is an unbiased
estimator for the parameter $\theta$ in the sense that

$$\frac{\Theta_{ab}\xi^a(\theta)\xi^b(\theta)}{g_{cd}\xi^c(\theta)\xi^d(\theta)} = \theta \tag{10.10}$$

for any state vector $\xi^a(\theta)$ along the specified trajectory.

In particular, one can verify that if the symmetric operator $\Theta_{ab}$ is canonically
conjugate to $F_{ab}$, then its spectral resolution $M_{ab}(x)$ necessarily satisfies the covariance
relation (10.8). A symmetric operator $\Theta_{ab}$ is defined to be canonically
conjugate to a self-adjoint operator $F_{ab}$ if for all values of the parameters $\theta$, $\phi$ we
have

$$\exp[-\theta J^a{}_bF^b{}_c]\,\exp[\phi J^c{}_d\Theta^d{}_e]\,\exp[\theta J^e{}_fF^f{}_g] = \exp[\phi J^a{}_b(\Theta^b{}_g + \theta\delta^b{}_g)] . \tag{10.11}$$

In other words, the unitary transformation $U^a{}_b(\theta)$ has the effect of shifting $\Theta^a{}_b$ by
the amount $\theta\delta^a{}_b$ in (10.11). If $F_{ab}$ and $\Theta_{ab}$ are self-adjoint, then the exponentials
in (10.11) can be given meaning by a spectral representation, and (10.11) is
equivalent to the Weyl relation. On the other hand, it may be that the given
self-adjoint operator $F_{ab}$ has no self-adjoint canonical conjugate. Nevertheless there
may exist a maximally symmetric operator $\Theta_{ab}$ satisfying (10.11), if we define

$$\exp[\phi J^a{}_c\Theta^c{}_b] := \int_{-\infty}^{\infty} \left[\cos(\phi x)\,\delta^a{}_c + \sin(\phi x)\,J^a{}_c\right] M^c{}_b(x)\,dx , \tag{10.12}$$

where $M_{ab}(x)$ is the unique spectral resolution for $\Theta_{ab}$ satisfying the required
conditions on its first and second moments. This occurs, for example, in the case
of a Hermitian operator that is bounded below, which admits no self-adjoint
canonical conjugate, but nevertheless under fairly general conditions admits a
maximally symmetric canonical conjugate satisfying (10.11).

Consider, for example, the case of a free particle in one dimension, for which the
momentum and position operators are denoted $P$ and $Q$, and the Hamiltonian is
$H = P^2/2m$. Then, $H$ has no self-adjoint canonical conjugate, but it does have
a well defined maximally symmetric canonical conjugate, given (Holevo 1982) by

$$T = m\,\mathrm{sign}(P)\,|P|^{-1/2}\,Q\,|P|^{-1/2} , \tag{10.13}$$

and it is not difficult to check, at least formally, that $TH - HT = i$.

If we work in the usual momentum representation, for which the wave function
is given by $\xi(p)$, assumed normalised, then $\mathcal{D}(T)$ is given by those functions for
which

$$\int_{-\infty}^{\infty} \left|\frac{d\left(\xi(p)|p|^{-1/2}\right)}{dp}\right|^2 |p|^{-1}\,dp < \infty . \tag{10.14}$$

If $\xi(p, t)$ is a one-parameter family of states satisfying the Schrödinger equation
$i\partial_t\xi = H\xi$, we find that $\langle\xi(t)|T|\xi(t)\rangle = t + \langle\xi(0)|T|\xi(0)\rangle$, which shows that $T$
is an estimator for $t$. We mention this example to illustrate the point that even
in a simple situation, the construction of the relevant estimator can be a subtle
matter.


11. Quantum Estimation 


Suppose now we consider a family of normalised state vectors $\xi^a(t)$ parameterised
by the time $t$, that satisfy the Schrödinger equation (9.3). The curve $\xi^a(t)$
lies on the unit sphere $\mathcal{S}$ in the real Hilbert space $\mathcal{H}$, and is characterised by
the fact that it is the unique lift of the quantum mechanical state trajectory in
the complex projective Hilbert space to the sphere $\mathcal{S}$ with the property that it is
everywhere orthogonal to the direction $J^a{}_b\xi^b$, as indicated in Proposition 10
(cf. figure 1). Regarding this curve as a statistical manifold, we shall study the
problem of estimating the time parameter $t$ by use of the geometric techniques
developed in §§2-5. Let $T_{ab}$ denote an unbiased estimator for $t$. Thus $T_{ab}$ is a real
symmetric Hermitian operator satisfying

$$\frac{T_{ab}\xi^a(t)\xi^b(t)}{g_{cd}\xi^c(t)\xi^d(t)} = t \tag{11.1}$$

for a system that is in the state $\xi^a(t)$. For example, if $T_{ab}$ is maximally symmetric
and canonically conjugate to $H_{ab}$, then by the argument of the previous section
we can make an adjustment of the form $T_{ab} \to T_{ab} + kg_{ab}$ for a suitable constant
$k$ to remove the bias of $T_{ab}$, which does not change the conjugacy condition, and
we are left with an estimator satisfying (11.1).

Our idea is to apply the generalised Bhattacharyya bounds established in §5
to the quantum mechanical estimation problem, and consider the possibility of
establishing sharper variants of the Heisenberg uncertainty relations $\Delta H\,\Delta T \geq 1/2$
in the case of canonically conjugate variables.

The geometrical content of the generalised Bhattacharyya bound is that given
the normal vector $\nabla_a t$ to the time-slice surfaces, we can choose a set of orthogonal
vectors in $\mathcal{H}$ and express the length of this vector in terms of its orthogonal
components. Then, by use of the Proposition relating the variance of an estimator
to the squared length of its gradient, we can formulate a set of bounds on
the variance of $T_{ab}$. In the ‘classical’ setting for parameter estimation in §5, we
found it natural to consider the orthogonal vectors given by $\hat\xi^{(k)a}$ ($k = 1, 2, \cdots$),
the $k$-th derivatives of the states projected orthogonally to the lower order
derivatives. These satisfy $g_{ab}\hat\xi^{(j)a}\hat\xi^{(k)b} = 0$ for $j \neq k$, and will be referred to as the
‘classical system’ of orthogonal vectors. In the quantum mechanical situation,
the resulting scheme of possible sets of orthogonal vectors is somewhat richer,
since the complex structure tensor can also be brought into play. In particular,
we find that the Cauchy-Riemann field $\zeta^a = J^a{}_b\dot\xi^b$ is orthogonal to $\dot\xi^a$, that
$J^a{}_b\ddot\xi^b$ is orthogonal to $\ddot\xi^a$, and so on. Therefore, we can construct a set of orthogonal
vectors given in terms of $\dot\xi^a$ and $\zeta^a$ and their higher order derivatives. These will
be referred to as the ‘quantum system’ of orthogonal vectors, and denoted $\hat\xi^{(k)a}$
and $\hat\zeta^{(k)a}$ ($k = 0, 1, 2, \cdots$), characterised by the property that $g_{ab}\hat\xi^{(j)a}\hat\xi^{(k)b} = 0$,
$g_{ab}\hat\zeta^{(j)a}\hat\zeta^{(k)b} = 0$, and $g_{ab}\hat\xi^{(j)a}\hat\zeta^{(k)b} = 0$, for $j \neq k$.

Before considering the higher order terms, we study the two lowest order terms
arising from the quantum system, to note the familiar inequalities from standard
quantum mechanics thus arising. In this case, the variance bound is:

$$\mathrm{Var}_\xi[T] \geq \frac{(\dot\xi^a\nabla_a t)^2}{4g_{bc}\dot\xi^b\dot\xi^c} + \frac{(\zeta^a\nabla_a t)^2}{4g_{bc}\zeta^b\zeta^c} . \tag{11.2}$$

Proposition 11. Let $\xi^a(t)$ satisfy the modified Schrödinger equation (9.3)
and set $\zeta^a = J^a{}_b\dot\xi^b$. Let $T_{ab}$ be an unbiased estimator for $t$. Then, $\dot\xi^a\nabla_a t = 1$, and
$\zeta^a\nabla_a t = -2\,\mathrm{Cov}_\xi[H, T]$, where $\mathrm{Cov}_\xi[H, T]$ denotes the covariance of the operators
$H_{ab}$ and $T_{ab}$ in the state $\xi^a$.

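The proof is as follows. The fact that $\dot\xi^a\nabla_a t = 1$ follows directly from the chain
rule. Alternatively, notice that differentiation of (11.1) with respect to $t$ implies,
by use of (9.3), that

$$\frac{2T_{ab}\,\xi^aJ^b{}_c\tilde H^c{}_d\,\xi^d}{g_{ef}\xi^e\xi^f} = 1 \tag{11.3}$$

for any state vector on the specified trajectory $\xi^a(t)$. On the other hand, $\dot\xi^a\nabla_a t =
J^a{}_b\tilde H^b{}_c\xi^c\nabla_a t$ by (9.3), and

$$\nabla_a t = \frac{2\tilde T_{ab}\xi^b}{g_{cd}\xi^c\xi^d} , \tag{11.4}$$

which taken together with (11.3) imply $\dot\xi^a\nabla_a t = 1$, as required. It follows then
from the definition of $\zeta^a$ together with (9.3) and (8.1) that $\zeta^a = -\tilde H^a{}_b\xi^b$. Thus
we have $\zeta^a\nabla_a t = -2\tilde H_{ac}\tilde T^c{}_b\xi^a\xi^b/\xi_c\xi^c$, which by virtue of the definition of covariance
given in §2 leads to the result $\zeta^a\nabla_a t = -2\,\mathrm{Cov}_\xi[H, T]$.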

Therefore, if we write $\Delta_\xi T^2 := \langle\xi|(T^2 - \langle T\rangle^2)|\xi\rangle$ for the variance of the estimator
$T$, then, for the lowest order terms in the quantum variance bound, the
inequality (11.2) reads

$$\Delta_\xi T^2\,\Delta_\xi H^2 \geq \frac{1}{4} + \mathrm{Cov}^2_\xi[H, T] . \tag{11.5}$$

In obtaining this result we have used the fact that the Fisher information in (11.2)
is given by $\mathcal{G} = 4g_{ab}\dot\xi^a\dot\xi^b = 4\Delta_\xi H^2$, where $\Delta_\xi H^2 = \mathrm{Var}_\xi[H]$ is the variance of the
Hamiltonian (squared energy uncertainty) in the state $\xi^a$. In this way, we recover the
standard ‘textbook’ account of the uncertainty relations (see, e.g., Isham 1995). In
particular, the second term on the right of (11.5) is usually represented by an
anticommutator, via the relation

$$\mathrm{Cov}_\xi[H, T] = \tfrac{1}{2}E_\xi[\tilde H\tilde T + \tilde T\tilde H] , \tag{11.6}$$

where $\tilde H = H - E_\xi[H]$ and $\tilde T = T - E_\xi[T]$. If we omit the second term in (11.5)
and keep the first term, corresponding to a quantum extension of the standard
Cramér-Rao lower bound (4.2), we find

$$\Delta_\xi T^2\,\Delta_\xi H^2 \geq \frac{1}{4} . \tag{11.7}$$


The statistical interpretation of this result is as follows. Suppose we are told that at $t = 0$ a quantum mechanical system is in the state $\xi^a(0)$, and that it evolves subsequently according to the Schrödinger equation, with a prescribed Hamiltonian. Some time later we are presented with the system (or perhaps a large number of independent, identical copies of it), and we are required to make a measurement (or a set of identically designed measurements on all the copies) to determine $t$. The measurement is given by an observable $T_{ab}$ characterised by a nonorthogonal resolution of the identity $M_{ab}(x)$. The probability that the result $T$ of a given measurement lies in the range $(\alpha, \beta)$ is $\mathrm{Prob}[\alpha < T < \beta] = \int_\alpha^\beta M_{ab}(x)\,\xi^a(t)\xi^b(t)\,dx$, and for the expectation of $T$ we have $E_\xi[T] = t$. Thus, by averaging the results on all the copies we can approximate the value of $t$. The variance of $T$ is necessarily bounded from below, in accordance with (11.7). On the other hand, the variance of the average of the results on $n$ independent copies is $n^{-1}\mathrm{Var}_\xi[T]$. Hence, by making repeated measurements on different copies of the system we can improve the reliability of the estimate for $t$, despite the uncertainty principle.


12. Higher Order Quantum Variance Bounds 


Some general remarks are in order concerning the relations (11.5). We note that although the first term in (11.5) is independent of the specific choice of estimator, the second term involving the covariance between $H_{ab}$ and $T_{ab}$ depends on the choice of $T_{ab}$. Hence this term is often dropped in the consideration of uncertainty relations, although in general the bound must be sharper than what we have in (11.7). On the other hand, the reader may have observed that in deriving (11.5) we have not, in fact, assumed that $T_{ab}$ is canonically conjugate to $H_{ab}$. We have merely assumed that $T_{ab}$ is an estimator for $t$, for the given trajectory $\xi^a(t)$, in accordance with (11.1). This is a weaker condition than canonical conjugacy, and thus it is legitimate to enquire whether, under the assumption of canonical conjugacy, it might be possible to derive bounds sharper than (11.7), but nevertheless independent of the specific choice of estimator. Therefore, following the general approach outlined in §5, we propose to study contributions from higher order Bhattacharyya type corrections to the CR lower bound to search for such terms. What we find is that some of the corrections depend upon the choice of $T_{ab}$, while others do not. Those terms that are independent of the choice of the estimator contribute to a set of generalised Heisenberg relations for quantum statistical estimation.

Before investigating details, we present some general results useful in obtaining higher order corrections. We assume that the state trajectory $\xi^a(t)$ satisfies the dynamical equation (9.3), $\dot\xi^a = J^a_{\ b}\tilde H^b_{\ c}\xi^c$.

Lemma 4. Let $\xi^{(n)a}$ denote the n-th derivative of $\xi^a$ with respect to the time parameter $t$. Then $g_{ab}\xi^{(n)a}\xi^{(n)b} = \langle\tilde H^{2n}\rangle$, where $\langle\tilde H^r\rangle$ denotes the r-th moment of the Hamiltonian about its mean.

This follows directly by differentiation of the Schrödinger equation (9.3), and use of the Hermiticity condition for the metric $g_{ab}$. An example for $n = 1$ is given by the expression $\mathcal{G} = 4g_{ab}\dot\xi^a\dot\xi^b = 4\langle\tilde H^2\rangle$ for the Fisher information.

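Lemma 5. For a Schrödinger state $\xi^a(t)$ the even moments of the Hamiltonian about its mean are independent of the time parameter.

Indeed, since $\langle\tilde H^{2n}\rangle = g_{ab}\xi^{(n)a}\xi^{(n)b}$, we obtain $\partial_t\langle\tilde H^{2n}\rangle = 2g_{ab}J^a_{\ c}\tilde H^c_{\ d}\,\xi^{(n)d}\xi^{(n)b}$, which vanishes. An elementary consequence of this result is that for arbitrary $n$ we have:

$$g_{ab}\,\xi^{(n)a}\xi^{(n+1)b} = 0. \tag{12.1}$$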
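As a numerical illustration of Lemma 4 and of (12.1), the following Python sketch uses a three-level system with a diagonal Hamiltonian. It is a sketch under stated assumptions rather than part of the original argument: it takes the usual complex form of the evolution, $\dot\xi = -i\tilde H\xi$, with the real inner product $g(u,v) = \mathrm{Re}\langle u|v\rangle$, and the spectrum, amplitudes and function names are hypothetical choices made only for the example.

```python
# Illustrative check of Lemma 4 and of (12.1) for a three-level system.
# Assumptions (not from the paper): complex form of the dynamics,
# d(xi)/dt = -i * Htilde * xi, and real inner product g(u, v) = Re <u|v>.

def central_moment(energies, probs, r):
    """r-th moment of the energy about its mean."""
    mean = sum(p * e for p, e in zip(probs, energies))
    return sum(p * (e - mean) ** r for p, e in zip(probs, energies))

def derivative(energies, amplitudes, n):
    """n-th time derivative of the state at t = 0, component by component."""
    probs = [abs(c) ** 2 for c in amplitudes]
    mean = sum(p * e for p, e in zip(probs, energies))
    return [(-1j * (e - mean)) ** n * c for e, c in zip(energies, amplitudes)]

def g(u, v):
    """Real inner product Re<u|v>."""
    return sum((a.conjugate() * b).real for a, b in zip(u, v))

energies = [0.0, 1.0, 3.0]             # hypothetical spectrum
amplitudes = [0.5, 0.5 + 0.5j, 0.5]    # normalised: 0.25 + 0.5 + 0.25 = 1
probs = [abs(c) ** 2 for c in amplitudes]

for n in range(4):
    xi_n = derivative(energies, amplitudes, n)
    xi_n1 = derivative(energies, amplitudes, n + 1)
    # Columns: n, squared norm of xi^(n), moment <Htilde^(2n)>, g(xi^(n), xi^(n+1))
    print(n, g(xi_n, xi_n), central_moment(energies, probs, 2 * n), g(xi_n, xi_n1))
```

The second and third columns should agree (Lemma 4), and the last column should vanish up to rounding, as in (12.1).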

A remarkable result which is essential in finding higher order corrections that are independent of the choice of $T$ is the following.

Proposition 12. Let $T_{ab}$ be canonically conjugate to $H_{ab}$, and hence an unbiased estimator for the parameter $t$. Then

$$T_{ab}\,\xi^{(n)a}\xi^{(n)b} = t\,g_{ab}\,\xi^{(n)a}\xi^{(n)b} + k, \tag{12.2}$$

where $k$ is a constant. Thus for each $n$, $(T_{ab} - t\,g_{ab})\,\xi^{(n)a}\xi^{(n)b}$ is a constant of the motion along the Schrödinger trajectory.

Proof. We recall from §10 that $T_{ab}$ is canonically conjugate to $H_{ab}$ if $(T_{ac}H_{bd} - H_{ac}T_{bd})\Omega^{cd} = g_{ab}$, a relation which can also be written in the symmetric form

$$(T_{ac}H_{bd} + T_{bc}H_{ad})\,\Omega^{cd} = g_{ab}. \tag{12.3}$$

Now suppose we differentiate $T_{ab}\xi^{(n)a}\xi^{(n)b}$. The Schrödinger equation in the form $\dot\xi^b = \Omega^{bc}\tilde H_{cd}\xi^d$ implies $\xi^{(n+1)b} = \Omega^{bc}\tilde H_{cd}\xi^{(n)d}$, and thus

$$\partial_t\big(T_{ab}\,\xi^{(n)a}\xi^{(n)b}\big) = 2\,T_{ab}\,\Omega^{bc}\tilde H_{cd}\,\xi^{(n)a}\xi^{(n)d}. \tag{12.4}$$

Since $T_{ab}\Omega^{bc}g_{cd}\,\xi^{(n)a}\xi^{(n)d}$ vanishes automatically, we can replace the $\tilde H_{cd}$ on the right-hand side of (12.4) with $H_{cd}$. However, according to (12.3) we have

$$2\,T_{ab}\,\Omega^{bc}H_{cd}\,\xi^{(n)a}\xi^{(n)d} = g_{ab}\,\xi^{(n)a}\xi^{(n)b}. \tag{12.5}$$

On the other hand, Lemma 5 says that $g_{ab}\xi^{(n)a}\xi^{(n)b}$ is independent of $t$. Thus by integration of (12.4) we obtain the desired result.


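Lemma 6. Let $T_{ab}$ be canonically conjugate to $H_{ab}$. Then for odd integers $n$, with $m = (n-1)/2$, we have:

$$2\,T_{ab}\,\xi^a\xi^{(n)b} = (-1)^m\,n\,g_{ab}\,\xi^{(m)a}\xi^{(m)b}. \tag{12.6}$$

We sketch the derivation of this result. First, for $n = 1$, it follows from $T_{ab}\xi^a\xi^b = t$ that $2T_{ab}\xi^a\dot\xi^b = 1$.

By differentiating this twice, we find $T_{ab}\xi^a\xi^{(3)b} + 3T_{ab}\dot\xi^a\ddot\xi^b = 0$. On the other hand, it follows from Proposition 12 that $T_{ab}\dot\xi^a\dot\xi^b = t\,g_{ab}\dot\xi^a\dot\xi^b + k$. Formula (12.1) then allows us to deduce that $2T_{ab}\dot\xi^a\ddot\xi^b = g_{ab}\dot\xi^a\dot\xi^b$. This gives us the desired result in the case $n = 3$, namely: $2T_{ab}\xi^a\xi^{(3)b} = -3g_{ab}\dot\xi^a\dot\xi^b$.

If we differentiate $T_{ab}\xi^a\xi^b = t$ five times, we obtain $T_{ab}\xi^a\xi^{(5)b} + 5T_{ab}\dot\xi^a\xi^{(4)b} = -10T_{ab}\ddot\xi^a\xi^{(3)b}$. Then differentiating $2T_{ab}\dot\xi^a\ddot\xi^b = g_{ab}\dot\xi^a\dot\xi^b$ twice, we find $T_{ab}\dot\xi^a\xi^{(4)b} = -3T_{ab}\ddot\xi^a\xi^{(3)b}$, from which it follows that $T_{ab}\xi^a\xi^{(5)b} = 5T_{ab}\ddot\xi^a\xi^{(3)b}$. However, since $T_{ab}\ddot\xi^a\ddot\xi^b = t\,g_{ab}\ddot\xi^a\ddot\xi^b + k$, we deduce by use of (12.1) that $2T_{ab}\ddot\xi^a\xi^{(3)b} = g_{ab}\ddot\xi^a\ddot\xi^b$, and the desired result follows for $n = 5$, namely: $2T_{ab}\xi^a\xi^{(5)b} = 5g_{ab}\ddot\xi^a\ddot\xi^b$. Higher order formulae can be deduced analogously.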
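The repeated differentiation used in the derivation of Lemma 6 can be mechanised. The following Python sketch is an illustration only, with hypothetical names. It assumes nothing beyond the two facts used in the text: that $2T_{ab}\xi^{(p)a}\xi^{(p+1)b} = g_{ab}\xi^{(p)a}\xi^{(p)b}$ is constant for every $p$ (Proposition 12 together with (12.1)), and the Leibniz rule. It solves the resulting triangular system for the coefficient in (12.6) in exact arithmetic.

```python
from fractions import Fraction
from math import comb

def lemma6_coefficient(n):
    """Return c_n such that 2 T(xi, xi^(n)) = c_n * g(xi^(m), xi^(m)), m = (n-1)/2.

    Unknowns: x[j] = T(xi^(j), xi^(n-j)) / g(xi^(m), xi^(m)) for j = 0..m.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    m = (n - 1) // 2
    # Differentiating Proposition 12 once and using (12.1):
    # 2 T(xi^(m), xi^(m+1)) = g(xi^(m), xi^(m)).
    x = {m: Fraction(1, 2)}
    for q in range(1, m + 1):
        p = m - q
        # 2 T(xi^(p), xi^(p+1)) is constant, so its (2q)-th derivative vanishes:
        # sum_i C(2q, i) T(xi^(p+i), xi^(p+1+2q-i)) = 0   (Leibniz rule).
        known = Fraction(0)
        for i in range(1, 2 * q + 1):
            j = min(p + i, n - (p + i))   # T is symmetric in its two arguments
            known += comb(2 * q, i) * x[j]
        x[p] = -known                     # the i = 0 term is x[p] with coefficient 1
    return 2 * x[0]

for n in (1, 3, 5, 7, 9):
    print(n, lemma6_coefficient(n))
```

For $n = 1, 3, 5$ this reproduces the coefficients $1, -3, 5$ obtained by hand above, and for larger odd $n$ it continues the pattern $(-1)^m n$.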

Armed with these results we are now in a position to deduce some higher order corrections to the Heisenberg relations for canonically conjugate observables. Again, we consider the measurement problem for the parameter $t$ in the case of a one-parameter family of state vectors $\xi^a(t)$ generated by the Schrödinger evolution (9.3), with a given Hamiltonian $H_{ab}$. The observable $T_{ab}$ is then taken to be a canonically conjugate unbiased estimator for the parameter $t$, in accordance with the theory developed in §10.

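First we consider the two second order corrections to the variance bound for $T_{ab}$ arising when $\nabla_a t$ is expanded in terms of the 'quantum system' of orthogonal vectors, given by $\hat\xi^{(2)a}$ and $\hat\zeta^{(2)a}$. The variance bound then takes the form

$$\mathrm{Var}_\xi[T] \geq \frac{(\dot\xi^a\nabla_a t)^2}{4\,\dot\xi^b\dot\xi_b} + \frac{(\zeta^a\nabla_a t)^2}{4\,\zeta^b\zeta_b} + \frac{(\hat\xi^{(2)a}\nabla_a t)^2}{4\,\hat\xi^{(2)b}\hat\xi^{(2)}_b} + \frac{(\hat\zeta^{(2)a}\nabla_a t)^2}{4\,\hat\zeta^{(2)b}\hat\zeta^{(2)}_b}, \tag{12.7}$$

where $\hat\xi^{(2)a}$ and $\hat\zeta^{(2)a}$ are given, respectively, for a normalised state $\xi^a\xi_a = 1$, by

$$\hat\xi^{(2)a} = \ddot\xi^a - \frac{\ddot\xi^b\zeta_b}{\zeta^c\zeta_c}\,\zeta^a - (\ddot\xi^b\xi_b)\,\xi^a \tag{12.8}$$

and

$$\hat\zeta^{(2)a} = \dot\zeta^a - \frac{\dot\zeta^b\dot\xi_b}{\dot\xi^c\dot\xi_c}\,\dot\xi^a - (\dot\zeta_b J^b_{\ c}\xi^c)\,J^a_{\ d}\xi^d. \tag{12.9}$$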


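Here we have used the fact that in order to obtain the second order (quantum) system of orthogonal vectors, we subtract the components of lower order derivatives of $\xi^a$ and $\zeta^a$ from $\ddot\xi^a$ and $\dot\zeta^a$. There are only three terms appearing in these expressions since $\ddot\xi^a\dot\xi_a = \dot\zeta^a\zeta_a = 0$ and $\dot\zeta^a\xi_a = \ddot\xi_a J^a_{\ b}\xi^b = 0$. It can be verified by use of (8.1) and the Hermitian condition for $H_{ab}$ that the norms of $\hat\xi^{(2)a}$ and $\hat\zeta^{(2)a}$ agree, and are given by

$$\hat\xi^{(2)a}\hat\xi^{(2)}_a = \hat\zeta^{(2)a}\hat\zeta^{(2)}_a = \langle\tilde H^2\rangle^2 K_\xi^2, \tag{12.10}$$

where $K_\xi^2$ is defined by

$$K_\xi^2 := \frac{\langle\tilde H^4\rangle}{\langle\tilde H^2\rangle^2} - \frac{\langle\tilde H^3\rangle^2}{\langle\tilde H^2\rangle^3} - 1. \tag{12.11}$$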

We note that $K_\xi^2$ is the curvature of the corresponding classical 'thermal' state defined by the differential equation

$$\frac{d\xi^a}{d\beta} = -\tfrac12\,\tilde H^a_{\ b}\,\xi^b. \tag{12.12}$$
The term 'thermal state' used in this context is meant to suggest that we identify the parameter in the differential equation (12.12) with the inverse temperature $\beta$ (Brody and Hughston 1996d, 1997a). It is interesting to note that the Schrödinger trajectories form a family of curves orthogonal to the corresponding classical thermal state trajectories, i.e., wherever they meet, the inner product of their tangent vectors vanishes. According to the argument outlined in §4, a thermal trajectory comprises an exponential family of distributions. However, since $\Omega_{ac}\tilde H^c_{\ b}$ is antisymmetric, the Schrödinger equation does not generate an exponential family in the $t$ variable.


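The first two terms on the right of (12.7) lead, as we have seen, to the standard first order uncertainty relation (11.5). Now we proceed to evaluate the second order terms. The numerators appearing in the second order terms in (12.7) can be calculated as follows. We consider the term involving $\hat\xi^{(2)a}$ first. By use of (11.4) and (12.8), and the fact that $\hat\xi^{(2)a}\xi_a = 0$, this can be seen to be given by four times the square of the expression

$$T_{ab}\,\ddot\xi^a\xi^b - \frac{\ddot\xi^c\zeta_c}{\zeta^d\zeta_d}\,T_{ab}\,\zeta^a\xi^b - (\ddot\xi^c\xi_c)\,T_{ab}\,\xi^a\xi^b. \tag{12.13}$$

Since $2T_{ab}\xi^a\dot\xi^b = 1$, we have $T_{ab}\xi^a\ddot\xi^b = -T_{ab}\dot\xi^a\dot\xi^b$. Likewise $\ddot\xi^a\xi_a = -\dot\xi^a\dot\xi_a$. Thus for the first and third terms in (12.13) we can write

$$T_{ab}\,\ddot\xi^a\xi^b - (\ddot\xi^c\xi_c)\,T_{ab}\,\xi^a\xi^b = -\tilde T_{ab}\,\dot\xi^a\dot\xi^b, \tag{12.14}$$

which, by Proposition 12, is constant along the Schrödinger trajectory. As a consequence of the Hermitian condition on $T_{ab}$, we find that

$$\tilde T_{ab}\,\dot\xi^a\dot\xi^b = \tilde H^a_{\ c}\tilde T_{ab}\tilde H^b_{\ d}\,\xi^c\xi^d = \tfrac12\langle\{\tilde T, \tilde H^2\}\rangle, \tag{12.15}$$

where $\{\tilde T, \tilde H^2\}$ is the anticommutator of $\tilde T$ and $\tilde H^2$. In deriving the second equality in (12.15) we make use of the identity $\tilde T\tilde H^2 + \tilde H^2\tilde T = 2\tilde H\tilde T\tilde H$, which is valid for canonically conjugate operators. On the other hand, the dynamical equation for $\xi^a$ implies that the expression $T_{ab}\zeta^a\xi^b$ appearing in the second term of (12.13) is minus the covariance $\mathrm{Cov}_\xi[H,T]$, that is, minus one half of the expectation of the anticommutator of $\tilde T$ and $\tilde H$. Since $\ddot\xi^a = -\tilde H^a_{\ b}\tilde H^b_{\ c}\xi^c$ and $\zeta^a = -\tilde H^a_{\ b}\xi^b$, we find that $\ddot\xi^a\zeta_a = \langle\tilde H^3\rangle$ and $\zeta^a\zeta_a = \langle\tilde H^2\rangle$. Thus, combining together these various expressions, we obtain: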


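$$\frac{(\hat\xi^{(2)a}\nabla_a t)^2}{4\,\hat\xi^{(2)b}\hat\xi^{(2)}_b} = \frac{1}{4\langle\tilde H^2\rangle^2 K_\xi^2}\left(\langle\{\tilde T,\tilde H^2\}\rangle - \frac{\langle\tilde H^3\rangle}{\langle\tilde H^2\rangle}\,\langle\{\tilde T,\tilde H\}\rangle\right)^2. \tag{12.16}$$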


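Let us turn to the term in (12.7) involving $\hat\zeta^{(2)a}$. We find that $T_{ab}\dot\zeta^a\xi^b = 0$ and $T_{ab}J^a_{\ c}\xi^c\xi^b = 0$. Since $\dot\zeta^a\dot\xi_a = -\langle\tilde H^3\rangle$, the contribution from the $\hat\zeta^{(2)a}$ term is thus

$$\frac{(\hat\zeta^{(2)a}\nabla_a t)^2}{4\,\hat\zeta^{(2)b}\hat\zeta^{(2)}_b} = \frac{1}{4\langle\tilde H^2\rangle}\left(\frac{\langle\tilde H^3\rangle^2}{\langle\tilde H^2\rangle^3 K_\xi^2}\right). \tag{12.17}$$

If we omit the terms contributing from (12.16), which depend upon the features of the specific choice of estimator $T$, then by consideration of the terms represented in (12.17) we obtain the following sharpened variance bound for $T_{ab}$ which takes the form of a generalised Heisenberg relation: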
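Proposition 13. If $T$ and $H$ are canonically conjugate, then the following bound applies to the product of their variances in the Schrödinger state $\xi^a(t)$:

$$\langle\tilde T^2\rangle\langle\tilde H^2\rangle \geq \frac14\left(1 + \frac{\langle\tilde H^3\rangle^2}{\langle\tilde H^2\rangle^3 K_\xi^2}\right). \tag{12.18}$$

The inequality (12.18) is expressed in terms of natural statistical 'invariants', namely, the skewness $\langle\tilde H^3\rangle^2/\langle\tilde H^2\rangle^3$ and the curvature $K_\xi^2$. Alternatively, we can write (12.18) directly in terms of the central moments of the Hamiltonian:

$$\langle\tilde T^2\rangle\langle\tilde H^2\rangle \geq \frac14\left(1 + \frac{\langle\tilde H^3\rangle^2}{\langle\tilde H^4\rangle\langle\tilde H^2\rangle - \langle\tilde H^2\rangle^3 - \langle\tilde H^3\rangle^2}\right). \tag{12.19}$$

The positivity of the denominator in the correction term can be verified directly by noting that this is the squared norm of the state $|\psi\rangle$ defined by

$$|\psi\rangle = \langle\tilde H^2\rangle^{1/2}\left(\tilde H^2 - \frac{\langle\tilde H^3\rangle}{\langle\tilde H^2\rangle}\,\tilde H - \langle\tilde H^2\rangle\right)|\xi\rangle, \tag{12.20}$$

assuming $\langle\xi|\xi\rangle = 1$, which is nonvanishing providing that $\tilde H^2|\xi\rangle$ does not lie in the span of $\tilde H|\xi\rangle$ and $|\xi\rangle$. This also follows from the statistical identity noted in connection with formula (5.7).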
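To give a feel for the sizes involved, the right-hand side of (12.19) can be evaluated for a simple discrete energy distribution. The following Python sketch is purely an arithmetic illustration of the moment expressions in (12.11) and (12.19); the three-point distribution, the function names, and the use of exact rational arithmetic are assumptions of the example and are not taken from the text.

```python
from fractions import Fraction

def central_moments(energies, probs, orders=(2, 3, 4)):
    """Central moments <Htilde^r> of a discrete energy distribution."""
    mean = sum(p * e for p, e in zip(probs, energies))
    return {r: sum(p * (e - mean) ** r for p, e in zip(probs, energies))
            for r in orders}

def curvature_sq(mu):
    """K^2 as defined in (12.11)."""
    return mu[4] / mu[2] ** 2 - mu[3] ** 2 / mu[2] ** 3 - 1

def denominator(mu):
    """Denominator of the correction term in (12.19)."""
    return mu[4] * mu[2] - mu[2] ** 3 - mu[3] ** 2

def heisenberg_bound(mu):
    """Right-hand side of (12.19); the denominator must be nonzero."""
    return Fraction(1, 4) * (1 + mu[3] ** 2 / denominator(mu))

# Hypothetical three-level example: energies 0, 1, 3 with weights 1/2, 1/4, 1/4.
mu = central_moments([0, 1, 3], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)])
print(mu[2], mu[3], mu[4])       # 3/2 3/2 9/2
print(curvature_sq(mu))          # 1/3
print(heisenberg_bound(mu))      # 3/4
```

In this example the denominator equals $\langle\tilde H^2\rangle^3 K_\xi^2 = 9/8$, so (12.18) and (12.19) agree, and the bound $3/4$ is three times the standard value $1/4$. For any two-point distribution the denominator vanishes identically, which matches the span condition stated after (12.20).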

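As a further illustration of the general formalism, we exhibit another, distinct bound on the variance, independent of the specific choice of estimator for the time parameter $t$, that arises naturally when we consider inequalities based on the 'classical system' of orthogonal vectors associated with $\xi^a(t)$. This bound can be derived when we examine the third order Bhattacharyya type correction, which is given by $(\hat\xi^{(3)a}\nabla_a t)^2/(4\hat\xi^{(3)b}\hat\xi^{(3)}_b)$, where $\hat\xi^{(3)a}$ is the component of $\xi^{(3)a}$ orthogonal to $\dot\xi^a$ and $\ddot\xi^a$. Now we know from Lemma 5 that $\dot\xi^a\dot\xi_a$ is constant along quantum trajectories, so $\ddot\xi^a\dot\xi_a = 0$. Furthermore, $\ddot\xi^a\xi_a = -\dot\xi^a\dot\xi_a$, so $\xi^{(3)a}\xi_a = 0$. Likewise, since $\ddot\xi^a\ddot\xi_a$ is constant, we have $\xi^{(3)a}\ddot\xi_a = 0$. Thus $\xi^{(3)a}$ is automatically orthogonal to $\xi^a$ and $\ddot\xi^a$ along quantum trajectories. It follows that

$$\hat\xi^{(3)a} = \xi^{(3)a} - \frac{\xi^{(3)b}\dot\xi_b}{\dot\xi^c\dot\xi_c}\,\dot\xi^a. \tag{12.21}$$

We are interested in the variance bound obtained by consideration of the first and third terms in (5.5):

$$\mathrm{Var}_\xi[T] \geq \frac{(\dot\xi^a\nabla_a t)^2}{4\,\dot\xi^b\dot\xi_b} + \frac{(\hat\xi^{(3)a}\nabla_a t)^2}{4\,\hat\xi^{(3)b}\hat\xi^{(3)}_b}. \tag{12.22}$$

Now, $\xi^{(3)a}\nabla_a t = -3\langle\tilde H^2\rangle$ according to Lemma 6. On the other hand, we have $\xi^{(3)a}\dot\xi_a = -\langle\tilde H^4\rangle$ and $\xi^{(3)a}\xi^{(3)}_a = \langle\tilde H^6\rangle$, from which it follows that

$$\hat\xi^{(3)a}\hat\xi^{(3)}_a = \langle\tilde H^6\rangle - \frac{\langle\tilde H^4\rangle^2}{\langle\tilde H^2\rangle}. \tag{12.23}$$

Putting these ingredients together, we thus obtain the following correction to the Cramér-Rao lower bound (cf. Brody and Hughston 1996b,c).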
Proposition 14. If $T$ and $H$ are canonically conjugate observable variables, then the following inequality holds along the Schrödinger trajectory $\xi^a(t)$ generated by $H$:

$$\langle\tilde T^2\rangle\langle\tilde H^2\rangle \geq \frac14\left(1 + \frac{(\langle\tilde H^4\rangle - 3\langle\tilde H^2\rangle^2)^2}{\langle\tilde H^6\rangle\langle\tilde H^2\rangle - \langle\tilde H^4\rangle^2}\right). \tag{12.24}$$

This correction is also strictly nonnegative, depends only on the given family of probability distributions determined by $\xi^a(t)$, and is independent of the specific choice of the estimator for the time parameter. The fact that the denominator in the correction term is positive follows from the observation that it is given by $\langle\tilde H^2\rangle$ times the squared norm of the state $|\psi\rangle$ defined by

$$|\psi\rangle = \left(\tilde H^3 - \frac{\langle\tilde H^4\rangle}{\langle\tilde H^2\rangle}\,\tilde H\right)|\xi\rangle, \tag{12.25}$$

where $\langle\xi|\xi\rangle = 1$. It is interesting to note that the numerator in the correction is the square of the fourth cumulant $\langle\tilde H^4\rangle - 3\langle\tilde H^2\rangle^2$ of the distribution, whose normalised form (the excess kurtosis) is usually denoted $\gamma_2$. The distributions for which $\gamma_2 > 0$ are called leptokurtic, and for $\gamma_2 < 0$ platykurtic. If the distribution is mesokurtic ($\gamma_2 = 0$), then this correction vanishes, and an example of such a distribution is the Gaussian. For applications in quantum mechanics, we normally expect a distribution for $H$ that is not Gaussian, since $H$ is typically bounded from below, so (12.24) will generally give a nontrivial correction. In the case of other canonically conjugate variables, e.g., position and momentum, matters are different, and it is possible that a state can have a Gaussian distribution in these variables, as in the case of coherent states.

In order to obtain some simple examples of the sort of numbers that might arise in connection with these corrections, suppose we assume that we have a physical system for which the energy is not definite, but rather has a known distribution, given by a density function $p(E)$. We shall examine the case when the energy has a gamma distribution, given by the density function of the form

$$p(E) = \frac{\sigma^\gamma}{\Gamma(\gamma)}\,E^{\gamma-1}e^{-\sigma E}, \tag{12.26}$$

with $0 \leq E < \infty$ and $\sigma, \gamma > 0$. This is to say, we have a large number of independent, identical systems with a prescribed Hamiltonian operator $H_{ab}$ and the Schrödinger state $\xi^a(t)$. Then, by a set of measurements we can determine the distribution of the energy, which is characterised by the density function $p(E)$ given by
$$p(E) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\xi_a\exp\!\big[i\lambda(H^a_{\ b} - E\,\delta^a_{\ b})\big]\xi^b\,d\lambda. \tag{12.27}$$

For a given probability distribution for the energy, whether a self-adjoint Hamiltonian operator with the corresponding spectral resolution exists, or not, is an open problem which we hope to address elsewhere. Here, instead, we rely on the simple observation that the gamma distribution (12.26) appears quite frequently in statistical studies, and hence it may help to provide an element of intuition as regards the behaviour of the correction terms. In the case of the gamma distribution, the moments are $\langle H^n\rangle = (\gamma + n - 1)!/\sigma^n(\gamma - 1)!$, and for the corresponding lowest relevant central moments we find $\langle\tilde H^2\rangle = \gamma/\sigma^2$, $\langle\tilde H^4\rangle = 3\gamma(\gamma + 2)/\sigma^4$, and


Phil. Trans. R. Soc. Lond. A (1996)

D.C. Brody, L.P. Hughston


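$\langle\tilde H^6\rangle = 5\gamma(3\gamma^2 + 26\gamma + 24)/\sigma^6$. It follows that the correction term in (12.24) is independent of the value of the parameter $\sigma$. We thus obtain

$$\langle\tilde T^2\rangle\langle\tilde H^2\rangle \geq \frac14\left(1 + \frac{18}{3\gamma^2 + 47\gamma + 42}\right). \tag{12.28}$$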
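The arithmetic leading to (12.28) can be checked directly from the raw moments of the gamma distribution. The Python sketch below is an illustrative check, not part of the original derivation. It assumes an integer shape parameter $\gamma$ so that the factorial formula for the moments applies, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def gamma_raw_moment(n, shape, rate):
    """<H^n> = (shape + n - 1)! / (rate^n (shape - 1)!), for integer shape >= 1."""
    return Fraction(factorial(shape + n - 1), factorial(shape - 1)) / Fraction(rate) ** n

def central_moment(r, shape, rate):
    """r-th central moment, built from the raw moments via the binomial theorem."""
    mean = gamma_raw_moment(1, shape, rate)
    return sum(comb(r, k) * gamma_raw_moment(k, shape, rate) * (-mean) ** (r - k)
               for k in range(r + 1))

def correction(shape, rate):
    """Correction term of (12.24): (mu4 - 3 mu2^2)^2 / (mu6 mu2 - mu4^2)."""
    mu2, mu4, mu6 = (central_moment(r, shape, rate) for r in (2, 4, 6))
    return (mu4 - 3 * mu2 ** 2) ** 2 / (mu6 * mu2 - mu4 ** 2)

for shape in (1, 2, 3):
    # Both columns agree, whatever the rate parameter sigma.
    print(shape, correction(shape, Fraction(7, 3)),
          Fraction(18, 3 * shape ** 2 + 47 * shape + 42))
```

For $\gamma = 1$ the correction is $9/46$, so the bound in (12.28) becomes $\tfrac14(1 + 9/46)$, independently of $\sigma$.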


In general, for Bhattacharyya style corrections based on the ‘classical system’ of 
orthogonal vectors, the even order contributions turn out to be dependent upon 
the choice of the estimator T, while the odd order corrections are manifestly 
independent of the specific choice of T, and can be expressed entirely in terms 
of central moments of the conjugate observable H. For example, the fifth order 
correction can be shown to take the form (Brody and Hughston 1996c) 

- 3(.H2)2) + H^{8H^H^ - H^) - 
- (# 4 ) 2 ) + - {h^Y) 

(12.29) 

Here we have used the slightly simplified notation for the n-th moment of 
the Hamiltonian about its mean. If we assume that the distribution of the energy 
is given by a basic exponential distribution with probability density $p(E) = \sigma\exp(-\sigma E)$, which corresponds to the value $\gamma = 1$ for the gamma distribution (12.26), then the corrections (12.24) and (12.29) lead to the following bound, independent of the specific value of $\sigma$:

$$\langle\tilde T^2\rangle\langle\tilde H^2\rangle \geq \frac14\left(1 + \frac{9}{46} + \frac{18{,}284{,}176}{290{,}027{,}815}\right). \tag{12.30}$$


The bounds given by (12.24) and (12.29) are significant inasmuch as they apply 
even if the odd-order central moments of the Hamiltonian vanish, in which case 
(12.18) would no longer extend the standard Heisenberg relation. 

Throughout the discussion here we have confined the argument to considera¬ 
tion of the time measurement problem. In this case we consider the one-parameter 
family of states generated by the Hamiltonian. However, the same line of argu¬ 
ment will apply for other pairs of canonically conjugate observables, such as 
position and momentum. 
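As an illustration for position and momentum, the lowest order relation of the form (11.5), with $x$ and $p$ in place of $T$ and $H$, can be checked numerically on a chirped Gaussian wave packet. The Python sketch below is an example under stated assumptions and is not taken from the text: it uses units with $\hbar = 1$, the hypothetical wave function $\psi(x) = \exp[-(a + ib)x^2]$, and a plain Riemann sum on a finite grid.

```python
import cmath

def chirped_gaussian_stats(a, b, half_width=12.0, steps=4801):
    """Var[x], Var[p] and Cov[x, p] for psi(x) = exp(-(a + i b) x^2), hbar = 1.

    The means of x and p vanish by symmetry, so second moments are variances.
    """
    dx = 2.0 * half_width / (steps - 1)
    norm = var_x = var_p = cov = 0.0
    for k in range(steps):
        x = -half_width + k * dx
        psi = cmath.exp(-(a + 1j * b) * x * x)
        # Momentum operator p = -i d/dx, with the derivative taken analytically.
        p_psi = -1j * (-2.0 * (a + 1j * b) * x * psi)
        weight = abs(psi) ** 2
        norm += weight
        var_x += x * x * weight
        var_p += abs(p_psi) ** 2
        # Re <psi| x p |psi> equals the symmetrised covariance (1/2)<xp + px>.
        cov += (psi.conjugate() * x * p_psi).real
    # The grid spacing cancels in these ratios.
    return var_x / norm, var_p / norm, cov / norm

var_x, var_p, cov = chirped_gaussian_stats(0.5, 0.3)
print(var_x * var_p, 0.25 + cov ** 2)
```

For this family of states the two printed numbers coincide, so the bound including the covariance term is saturated, while the product of variances exceeds $1/4$ whenever $b \neq 0$.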

The results indicated here can be pursued further in other ways as well, allowing 
us to consider various examples of natural statistical submanifolds of the quan¬ 
tum state space. For example, in a quantum field theoretic context it is natural 
to examine the coherent state submanifold of a bosonic Fock space. The geometry of this manifold arises when we consider measurements of the 'classical' field associated with the POM generated by the family of all coherent states. Another
interesting line of investigation intimately related to the arguments considered 
here concerns the status of thermodynamic states in classical and quantum sta¬ 
tistical mechanics (Brody and Hughston 1996d, 1997a,b). 